Abstract

Let $X$ be a compact, smooth, connected Riemannian manifold without boundary, and let
$G : X \times X \to \mathbb{R}$ be a kernel. Analogous to a radial basis function network, an eignet is an expression of the form
$\sum_{j=1}^{M} a_j G(\circ, y_j)$, where $a_j \in \mathbb{R}$, $y_j \in X$, $1 \le j \le M$. We describe a deterministic, universal algorithm
for constructing an eignet for approximating functions in $L^p(\mu; X)$ for a general class of measures $\mu$
and kernels $G$. Our algorithm yields linear operators. Using the minimal separation amongst the
centers $y_j$ as the cost of approximation, we give modulus of smoothness estimates for the degree of
approximation by our eignets, and show by means of a converse theorem that these are the best
possible for every individual function. We also give estimates on the coefficients $a_j$ in terms of the
norm of the eignet. Finally, we demonstrate that if any sequence of eignets satisfies the optimal
estimates for the degree of approximation of a smooth function, measured in terms of the minimal
separation, then the derivatives of the eignets also approximate the corresponding derivatives of the
target function in an optimal manner.

1 Introduction

In recent years, diffusion geometry techniques have developed into a powerful tool for the analysis of
nominally high dimensional data that has a low dimensional structure; for example, data that lies on a low
dimensional manifold in the high dimensional ambient space. Applications of these techniques include
document analysis [7], face recognition [18], semi-supervised learning [2j Q], image processing [12], and
cataloguing of galaxies [13]. The special issue of Applied and Computational Harmonic Analysis
Q\ | contains several papers that serve as a good introduction to this subject.

An essential ingredient in these techniques is the notion of a heat kernel $K_t$ on the manifold $X$ in
question, which can be defined formally by

$$K_t(x,y) = \sum_{j=0}^{\infty} \exp(-\ell_j^2 t)\,\phi_j(x)\phi_j(y), \qquad t > 0,\ x, y \in X,$$

where $\{\phi_j\}$ is an orthonormal basis for $L^2(\mu; X)$ for an appropriate measure $\mu$, and the $\ell_j$'s are nonnegative
numbers increasing to $\infty$ as $j \to \infty$. A multiresolution analysis is then defined by Coifman and Maggioni
[7] for a fixed $\epsilon > 0$ by defining the increasing sequence of scaling spaces

$$\mathrm{span}\,\{\phi_k : \exp(-2^{-j}\ell_k^2) \ge \epsilon\} = \mathrm{span}\,\{\phi_k : \ell_k^2 \le 2^{j} \log(1/\epsilon)\}.$$

The range of the operators generated by $K_{2^{-j}}$ being "close" to the space at level $j$, one may obtain
an approximate projection of a function by applying these operators to the function. In turn, these
operators can be computed using fast multipole techniques. The diffusion wavelets and wavelet packets
can be obtained by applying the Gram-Schmidt procedure to the kernels $K_{2^{-j}}$. On a more theoretical side,
Jones, Maggioni, and Schul [2D] have recently proved that the heat kernel can be used to construct a
local coordinate atlas on manifolds, preserving the order of magnitude of the distances between points
within each chart.

Since an explicit formula for the heat kernel is typically not known except on the simplest of manifolds,
in numerical implementations one considers in place of the heat kernel an approximation by means of
a suitable radial basis function, typically a Gaussian. The error in this approximation is investigated in
detail by several authors, for example, [13J [3TJ [3J H] . In a different direction, Saito [30] has advocated the use
of other kernels which commute with the heat kernel, and hence share its invariant subspaces,
but for which explicit formulas are known.

Several applications, especially in the context of semi-supervised learning, signal processing, and
pattern recognition, can be viewed as problems of function approximation. For example, given a few
digitized images of handwritten digits, one wishes to develop a model that will predict for any other
image whether the corresponding digit is 0. Each image may be viewed as a point in a high dimensional
space, and the target function is the characteristic function of the set of points corresponding to the digit
0. We observe in this context that even though $\mathcal{K}_t f \to f$ (uniformly if $f$ is continuous) as $t \to 0$, where
$\mathcal{K}_t$ is the heat operator defined by the kernel $K_t$, the rate of convergence provided by this simple minded
approximation cannot be the optimal one for smooth functions, since $\mathcal{K}_t \phi_j \ne \phi_j$ except when $\ell_j = 0$.
In this paper, for $L > 0$, an element of $\Pi_L := \mathrm{span}\,\{\phi_j : \ell_j \le L\}$ will be called a diffusion polynomial
of degree at most $L$, as in [25]. In [551 121], we have developed a different multiscale analysis based on
$\Pi_{2^j}$ as the scaling spaces. We have obtained a Littlewood-Paley expansion, valid for functions in all $L^p$
spaces including $p = 1, \infty$. This expansion is in terms of a tight frame transform, which can be used to
characterize different Besov spaces related to approximation by diffusion polynomials. Our tight frames
can also be chosen to be highly localized.

The main objective of this paper is to consider the approximation properties of a generalized translation
network of the form $\sum_{j=1}^{M} a_j G(\circ, y_j)$, where $G$ is a fixed kernel, $G : X \times X \to \mathbb{R}$, $M \ge 1$ is an integer
(the number of neurons), the coefficients $a_j$ are real numbers, and the centers $y_j$ are distinct points
in $X$. We will deal with kernels of the form $G(x,y) = \sum_{j=0}^{\infty} b(\ell_j)\phi_j(x)\phi_j(y)$. For this reason, we will call
the network an eignet. This paper is the first part of a two part investigation. In this paper, we consider
the case when $\{b(\ell_j)\ell_j^{\beta}\}$ remains bounded as $j \to \infty$; in a sequel, we plan to develop an analogous theory
for the case when $\{b(\ell_j)\}$ tends to $0$ exponentially fast as $j \to \infty$, in particular, including the case of the
heat kernel itself as $G$.

To explain our objectives in further detail, we describe first the general paradigm in approximation
theory. Typically, one considers a metric space $X$ and a nested, increasing sequence of subsets of $X$:
$V_0 \subset V_1 \subset \cdots \subset V_m \subset V_{m+1} \subset \cdots$. Elements of $V_m$ provide a model (approximant) for a target function
$f \in X$; the index $m$ is typically related to the model complexity. The density theorem is a statement
that $\bigcup_{m=0}^{\infty} V_m$ is dense in $X$. Let $d(X; f, g)$ denote the distance between $f, g \in X$. A deeper, and
central, problem of approximation theory is to investigate the rate at which the degree of approximation,
$\mathrm{dist}(X; f, V_m) := \inf_{P \in V_m} d(X; f, P)$, converges to $0$ as $m \to \infty$, depending upon certain conditions
on $f$. These conditions are encoded by a statement that $f \in W$ for a subset $W \subset X$, usually called a
smoothness class. In the most classical example, the trigonometric case, $X$ is the space of all continuous,
$2\pi$-periodic functions on $\mathbb{R}$, equipped with the supremum norm on $[-\pi, \pi]$, and $V_m$ denotes the class of all
trigonometric polynomials of order at most $m$; i.e., expressions of the form $\sum_{|j| \le m} a_j e^{ij\circ}$. The well known
equivalence theorem in this case states [8] that if $0 < \alpha < 1$, and $r \ge 0$ is an integer, then $\mathrm{dist}(X; f, V_m) =
O(m^{-r-\alpha})$ if and only if $f$ has $r$ continuous derivatives and $|f^{(r)}(x) - f^{(r)}(y)| = O(|x - y|^{\alpha})$, $x, y \in \mathbb{R}$. To
cover the case when $\alpha = 1$ is allowed, one needs to introduce higher order moduli of smoothness; a more
modern approach is to consider $K$-functionals. We observe that this theory is applicable to individual
functions, rather than being an assertion about the existence of a function to demonstrate that the rate at
which the degree of approximation converges to zero cannot be improved. In the general case, of course,
the interesting questions are to determine what one should mean by the model complexity, and what
smoothness classes are characterized by a given rate of convergence of $\mathrm{dist}(X; f, V_m)$ to $0$ as $m \to \infty$. In
the context of approximation by Gaussian networks, we have demonstrated in [27U26] that a satisfactory
theory can be developed by using the minimal separation amongst the centers as the measurement of
model complexity, with the smoothness classes defined in terms of certain weighted Besov spaces.

The main goal of this paper is to demonstrate equivalence theorems of approximation theory in the
case of eignets, where the complexity of the model is measured by the minimal separation amongst the
centers and the smoothness of the target function is measured by a suitable $K$-functional as in [25]. In
this paper, we will show that the smoothness classes characterized by the degrees of approximation by
eignets with minimal separation $q$ amongst the centers are the same as those characterized by the degrees
of approximation by $\Pi_{1/q}$, $q \to 0$.

There are several consequences of our approach, which we find interesting. First, we will give an 
explicit, stable, construction of an eignet, which is universal in the sense that it is defined for every 
function in $L^p$ (or every continuous function, depending upon the data available for the function). At the
same time, the approximation error for any individual function in a smoothness class is commensurate 
with the degree of approximation by the class of all eignets with the same minimal separation amongst the 
centers. Our operator will automatically minimize (up to a constant multiple) a regularization criterion, 
but does not require the solution of an optimization problem to achieve this. 

Second, for an arbitrary eignet, we will estimate the size of the coefficients in terms of the norm of the
eignet itself. This estimate will be in terms of the minimal separation amongst the centers. In particular,
if one wishes to interpolate using eignets, our result gives an estimate on the stability of the interpolation
matrix. Finally, we will consider the question of simultaneous approximation: if $\Psi$ is an arbitrary eignet,
and one knows an upper bound for $\|f - \Psi\|_p$, we estimate the error $\|(\Delta^*)^r f - (\Delta^*)^r \Psi\|_p$, where $\Delta^*$ is a
pseudo-differential operator.

One of the referees has kindly pointed out that our work here has several potential applications: signal
processing, Paley-Wiener theorems in inverse problems, computer vision, imaging, geo-remote sensing,
among others, and that further hints can be found in [TT1 [9l flOl [33l [34] .

The paper is organized as follows. In Section 2, we will describe the general set up, including the
conditions on the manifold, the system $\{\phi_j\}$, the kernel $G$, etc., together with some basic facts. The main
results are described in Section 3. The proofs of these results involve a great deal of estimations involving
many sums and integrals. These estimations being very similar, we prefer to present them concisely in a
somewhat abstract setting. This setting, and the form which the various objects in Section 3 take in it,
is explained in Section 4. Several preparatory lemmas and propositions of a technical nature are proved
in Section 5. In Section 6, we use these to prove the new results in Section 3. In a first reading, one may
wish to skip Section 5 and refer back to it as needed from Section 6.

We thank the referees and the editor for their many valuable suggestions for the improvement of 
the first draft of this paper. We thank Jürgen Prestin and Frank Filbir for their encouragement and
discussions during the preparation of this paper. 



2 The set up 

Our results in this paper involve a number of objects: the Riemannian manifold $X$, the geodesic distance
$\rho$ on $X$, a measure $\mu$ on $X$, the system $\{\phi_j\}$, the sequence $\{\ell_j\}$, the kernel $G$ for the eignet, etc. In this
section, we introduce the notations and various assumptions on these objects.

2.1 The manifold 

Throughout this paper, $X$ is assumed to be a ($C^{\infty}$) smooth, compact, connected Riemannian manifold,
$\rho$ denotes the geodesic distance on $X$, and $\mu$ is a fixed probability measure on $X$, not necessarily the manifold
measure on $X$. For $x \in X$, $r > 0$, let

$$B(x,r) := \{y \in X : \rho(x,y) \le r\}, \qquad \Delta(x,r) := X \setminus B(x,r).$$

We assume that there exists $\alpha > 0$ such that

$$\mu(B(x,r)) \le c r^{\alpha}, \qquad x \in X,\ r > 0. \tag{2.1}$$

Here, and in the sequel, the symbols $c, c_1, \cdots$ will denote generic positive constants depending only on
the fixed parameters in the discussion, such as $\rho$, $\mu$, the system $\{\phi_k\}$, and the norms, etc. Their value
may be different at different occurrences, even within a single formula. The notation $A \sim B$ means that
$c_1 A \le B \le c_2 A$.

If $Y \subseteq X$ is $\mu$-measurable, and $f : Y \to \mathbb{C}$ is a $\mu$-measurable function, we will write

$$\|f\|_{Y,p} := \begin{cases} \left\{\displaystyle\int_Y |f(x)|^p \, d\mu(x)\right\}^{1/p}, & \text{if } 1 \le p < \infty, \\[2mm] \mu\text{-}\operatorname{ess\,sup}_{x \in Y} |f(x)|, & \text{if } p = \infty. \end{cases}$$

The class of all $f$ with $\|f\|_{Y,p} < \infty$ will be denoted by $L^p(Y)$, with the usual convention of considering
two functions to be equal if they are equal $\mu$-almost everywhere. If $Y = X$, we will omit its mention
from the notations. For $1 \le p \le \infty$, we define $p' = p/(p-1)$ with the usual understanding that $1' = \infty$,
$\infty' = 1$. If $f_1 \in L^p$, $f_2 \in L^{p'}$, then

$$\langle f_1, f_2 \rangle := \int_X f_1(x) f_2(x) \, d\mu(x).$$

If $f \in L^p$, $W \subset L^p$, we define

$$\mathrm{dist}(p; f, W) := \inf_{P \in W} \|f - P\|_p,$$

an abbreviation for $\mathrm{dist}(L^p; f, W)$.

Let $\{\phi_j\}$ be an orthonormal system of functions in $L^2$, such that each $\phi_j$ is continuous on $X$ (and hence,
both integrable and bounded). We assume that $\phi_0(x) = 1$ for $x \in X$. Let $\{\ell_j\}$ be a nondecreasing sequence
of real numbers such that $\ell_0 = 0$, $\ell_j \uparrow \infty$ as $j \to \infty$. For $L \ge 0$, we write $\Pi_L := \mathrm{span}\,\{\phi_j : \ell_j \le L\}$. An
element of $\Pi_{\infty} := \bigcup_{L \ge 0} \Pi_L$ will be called a diffusion polynomial. For $P \in \Pi_{\infty}$, the degree of $P$ is the
minimum integer $L$ such that $P \in \Pi_L$. The $L^p$ closure of $\Pi_{\infty}$ will be denoted by $X^p$.

For $t > 0$, $x, y \in X$, we define the heat kernel on $X$ formally by

$$K_t(x,y) = \sum_{j=0}^{\infty} \exp(-\ell_j^2 t)\,\phi_j(x)\phi_j(y). \tag{2.2}$$

Although $K_t$ satisfies the semigroup property, and

$$\int_X K_t(x,y)\, d\mu(y) = 1, \qquad x \in X, \tag{2.3}$$

$K_t$ may not be the heat kernel in the classical sense. In particular, we do not assume that $K_t$ is
nonnegative. The only assumptions we make on $K_t$ are the following. With $\alpha > 0$ as in (2.1),

$$|K_t(x,y)| \le c_1 t^{-\alpha/2} \exp(-c\rho(x,y)^2/t), \qquad t \in (0,1],\ x, y \in X, \tag{2.4}$$

and for any of the first order directional derivatives $\partial$ with respect to a normal coordinate system,

$$|\partial_y K_t(x,y)| \le c_1 t^{-\alpha/2 - 1} \exp(-c\rho(x,y)^2/t), \qquad t \in (0,1],\ x, y \in X. \tag{2.5}$$

We note that our assumptions imply that $K_t(x,y)$ is well defined for all $x, y \in X$ and $t \in (0,1]$. It is
proved in [14] that (2.4) implies that

$$\sum_{\ell_j \le L} \phi_j(x)^2 \le c L^{\alpha}, \qquad L > 0. \tag{2.6}$$
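To make the conditions (2.2) to (2.4) concrete, the following minimal sketch evaluates a truncated version of the series (2.2) in the simplest model case. Everything specific in it is an assumption made for the illustration and is not taken from the paper: $X$ is the unit circle, $\mu$ is the normalized arc length, the system is $1, \sqrt{2}\cos kx, \sqrt{2}\sin kx$ with $\ell_j = k$, and the names `heat_kernel_circle` and `integrate_against_mu` are hypothetical. In this model $\alpha = 1$, so one expects $K_t(x,x)$ to behave like $t^{-1/2}$.

```python
import math

def heat_kernel_circle(t, x, y, n_terms=200):
    """Truncated series (2.2) on the unit circle (assumed model case).

    With phi_0 = 1 and the pairs sqrt(2)*cos(k x), sqrt(2)*sin(k x), ell = k,
    the series collapses to 1 + 2 * sum_k exp(-k^2 t) * cos(k (x - y)).
    """
    total = 1.0
    for k in range(1, n_terms + 1):
        total += 2.0 * math.exp(-k * k * t) * math.cos(k * (x - y))
    return total

def integrate_against_mu(func, n_nodes=512):
    """Integral with respect to normalized arc length, via equally spaced nodes."""
    return sum(func(2.0 * math.pi * i / n_nodes) for i in range(n_nodes)) / n_nodes

t = 0.05
# Numerical counterpart of (2.3): the kernel integrates to 1 in the second variable.
mass = integrate_against_mu(lambda y: heat_kernel_circle(t, 1.0, y))
# On-diagonal value; in this model it is close to sqrt(pi / t), i.e. of order t^(-1/2).
peak = heat_kernel_circle(t, 0.0, 0.0)
```

The on-diagonal growth seen in `peak` is the circle analogue of the upper bound (2.4) with $x = y$.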

In the case when the $\phi_j$'s (respectively, $\ell_j$'s) are the eigenfunctions (respectively, eigenvalues) of the square
root of the negative Laplacian on $X$, the assumptions (2.4) and (2.5) can be deduced from the bounds on
the spectral functions $\sum_{\ell_j \le L} (\partial \phi_j)^2(x)$ proved by Bin Xu [32] (cf. O), and the finite speed
of wave propagation. Kordyukov [22] has proved similar estimates in the case when $X$ has bounded geometry,
and the $\phi_k$'s are eigenfunctions of a general, second order, strictly elliptic partial differential operator.
Other examples, where $\mu$ is not the Riemannian measure on $X$, are given by Grigoryan in [17].

The bounds on the heat kernel are closely connected with the measures of the balls $B(x,r)$. For
example, it is proved in [IT] that the conditions (2.3), (2.1), and (2.4) imply that

$$\mu(B(x,r)) \ge c r^{\alpha}, \qquad 0 < r \le 1,\ x \in X. \tag{2.7}$$

In view of (2.1), this shows that $\mu$ satisfies the homogeneity condition

$$\mu(B(x,R)) \le c (R/r)^{\alpha} \mu(B(x,r)), \qquad x \in X,\ r \in (0,1],\ R > 0. \tag{2.8}$$

In many of the examples cited above, the kernel $K_t$ also satisfies a lower bound to match the upper bound
in (2.4). In this case, Grigoryan [17] has also shown that (2.1) is satisfied.

In the case when $X$ is the Euclidean sphere, or the rotation group $SO(3)$, the eigenfunctions of the
Laplace-Beltrami operator are polynomials, and hence, if $\Pi_L$ is the span of the appropriate eigenfunctions,
$P_1, P_2 \in \Pi_L$ imply that $P_1 P_2 \in \Pi_{2L}$. We are not aware of any concrete examples where this is not true.
In general, when $\Pi_L$ is a span of eigenfunctions of certain elliptic operators, we do not expect such a
precise inclusion. Nevertheless, each of the products $\phi_j \phi_k$ is infinitely often differentiable in this case,
and hence, it is reasonable to expect that $\mathrm{dist}(\infty; \phi_j \phi_k, \Pi_m) \to 0$ faster than any polynomial in $1/m$ as
$m \to \infty$. Since we are considering an even more general situation, where $\phi_j$, $\phi_k$ are not assumed to be
eigenfunctions of any elliptic operator, we need to make the following assumption as our substitute for
the lack of an algebra structure on $\Pi_{\infty}$.
Product assumption:
Let $A \ge 2$ be a fixed number, and for $L > 0$,

$$\epsilon_L := \sup_{\ell_j, \ell_k \le L} \mathrm{dist}(\infty; \phi_j \phi_k, \Pi_{AL}). \tag{2.9}$$

We assume that $L^c \epsilon_L \to 0$ as $L \to \infty$ for every $c > 0$. We conjecture that if $X$ is an analytic manifold
and the $\phi_k$'s are eigenfunctions of elliptic partial differential operators with analytic coefficients, then

$$\limsup_{L \to \infty} \epsilon_L^{1/L} < 1.$$

To summarize, our assumptions on the manifold, the measure, and the systems $\{\phi_k\}$, $\{\ell_k\}$ are: (2.1),
(2.3), (2.4), (2.5), and the product assumption.

2.2 Data sets and weights 

Let $K \subseteq X$ be a compact set, and $C \subset K$ be a finite set. The mesh norm $\delta(C, K)$ of $C$ relative to $K$ and the
minimal separation $q(C)$ are defined by

$$\delta(C, K) = \sup_{x \in K} \rho(x, C), \qquad q(C) = \min_{x, y \in C,\ x \ne y} \rho(x,y). \tag{2.10}$$

To keep the notation simple, we will write $\delta(C) := \delta(C, X)$. Of particular interest in this paper are sets $C$
satisfying

$$\delta(C) \le 2q(C). \tag{2.11}$$

The proof of the following proposition shows one way to construct such sets from arbitrary finite subsets
of X. Consistent with our policy of presenting all proofs in Section [51 this proof will be postponed to the 
end of this paper. 

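Proposition 2.1 (a) If $C \subset X$ is a finite set and $\epsilon > 0$, there exists $\tilde{C} \subseteq C$ such that $\delta(\tilde{C}, C) \le \epsilon \le q(\tilde{C})$.
In particular, for the set $\tilde{C}$ obtained with $\epsilon = \delta(C)$, $\delta(C) \le \delta(\tilde{C}) \le 2\delta(C) \le 2q(\tilde{C})$.

(b) If $C_0 \subseteq C_1 \subset X$ are finite subsets with $\delta(C_1) \le (1/2)\delta(C_0) \le q(C_0)$, then there exists $\tilde{C}_1$, with
$C_0 \subseteq \tilde{C}_1 \subseteq C_1$, such that $\delta(C_1) \le \delta(\tilde{C}_1) \le 2\delta(C_1) \le 2q(\tilde{C}_1)$.

(c) Let $\{C_m\}$ be a sequence of finite subsets of $X$, with $\delta(C_m) \sim 1/m$, and $C_m \subseteq C_{m+1}$, $m = 1, 2, \cdots$.
Then there exists a sequence of subsets $\{\tilde{C}_m \subseteq C_m\}$, where, for $m = 1, 2, \cdots$, $\delta(\tilde{C}_m) \sim 1/m$, $\tilde{C}_m \subseteq \tilde{C}_{m+1}$,
$\delta(\tilde{C}_m) \le 2q(\tilde{C}_m)$.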
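One natural way to obtain a subset as in part (a) is a greedy selection: scan the points and keep a point only if it is at distance at least $\epsilon$ from every point kept so far. The sketch below illustrates this on the unit circle. It is only an illustration under stated assumptions: the greedy rule is not claimed to be the construction used in the proof, the circle and its geodesic distance stand in for $(X, \rho)$, and the function names are hypothetical.

```python
import math

def circle_dist(x, y):
    """Geodesic distance between the angles x and y on the unit circle."""
    d = abs(x - y) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def greedy_separated_subset(points, eps, dist=circle_dist):
    """Keep a point only if it is at distance >= eps from all points kept so far."""
    kept = []
    for p in points:
        if all(dist(p, c) >= eps for c in kept):
            kept.append(p)
    return kept

def min_separation(points, dist=circle_dist):
    """q(C): smallest distance between two distinct points of C, as in (2.10)."""
    return min(dist(p, c) for i, p in enumerate(points) for c in points[i + 1:])

def mesh_norm_relative(subset, points, dist=circle_dist):
    """delta(subset, points): largest distance from a point of `points` to `subset`."""
    return max(min(dist(p, c) for c in subset) for p in points)

# Hypothetical data: 200 quasi-uniform points on the circle.
points = [2.0 * math.pi * ((i * 0.6180339887498949) % 1.0) for i in range(200)]
eps = 0.2
subset = greedy_separated_subset(points, eps)
```

Every discarded point lies within $\epsilon$ of some kept point, and any two kept points are at least $\epsilon$ apart, which is exactly the pair of inequalities in part (a).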

In the sequel, for any finite subset $C$ (respectively, $C_m$), we will only work with the subset $\tilde{C}$ (respectively,
$\tilde{C}_m$) as constructed above. Since the rest of the points in $C$ (respectively, $C_m$) are ignored in our
analysis, we may rename this subset again as $C$ (respectively, $C_m$) and assume that $C$ (respectively, $C_m$)
satisfies (2.11).

The following theorem is proved in [14], where we do not need the product assumption.

Theorem 2.1 Let $C$ be a finite subset of $X$ (satisfying (2.11)), $\delta(C) \le 1/6$. We assume further that
(2.1), (2.3), (2.4), and (2.5) hold. Then there exists $c > 0$ such that for $L \le c\,\delta(C)^{-1}$, we have

$$\|P\|_1 \le 2 \sum_{x \in C} \mu(B(x, \delta(C))) |P(x)| \le c_1 \|P\|_1, \qquad P \in \Pi_L. \tag{2.12}$$

Consequently, for $L \le c\,\delta(C)^{-1}$, there exist numbers $w_x$, $x \in C$, such that for each $x \in C$,

$$|w_x| \le c_2\, \mu(B(x, \delta(C))) \le c_3\, \delta(C)^{\alpha} \le c_4\, q(C)^{\alpha}, \tag{2.13}$$

and

$$\int_X P(y)\, d\mu(y) = \sum_{x \in C} w_x P(x), \qquad P \in \Pi_L. \tag{2.14}$$

A simple way to find the weights $w_x$ is to solve the least square problem of minimizing $\sum_{x \in C} w_x^2$ with
the constraints $\sum_{x \in C} w_x \phi_k(x) = \int_X \phi_k \, d\mu$, $k = 0, \cdots, L$ [24]. Alternately, one may obtain the $w_x$'s so as to
minimize

$$\sum_{\ell_k \le L} \left\{ \sum_{x \in C} w_x \phi_k(x) - \int_X \phi_k \, d\mu \right\}^2.$$

Efficient numerical algorithms for computing the weights in the context of the unit sphere can be found,
for example, in [24l [2T1 [T5] , Some of these ideas can be adapted to this context, but our main focus in
this paper is of a theoretical nature, and we will not comment further on this issue in this paper.
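As a small illustration of the first formulation, the sketch below computes the minimum-norm solution of the constraints in the model case of the unit circle with normalized arc length. All specifics are assumptions made for the example and not taken from the paper: the trigonometric system, the jittered nodes, and the hypothetical names `basis_row`, `solve_linear`, and `min_norm_quadrature_weights`. Writing the constraints as $Aw = b$ with $A_{k,x} = \phi_k(x)$, the minimizer of $\sum w_x^2$ is $w = A^T (A A^T)^{-1} b$.

```python
import math

def basis_row(k, x):
    """Assumed circle system: 1, sqrt2*cos(x), sqrt2*sin(x), sqrt2*cos(2x), ..."""
    if k == 0:
        return 1.0
    freq = (k + 1) // 2
    if k % 2 == 1:
        return math.sqrt(2.0) * math.cos(freq * x)
    return math.sqrt(2.0) * math.sin(freq * x)

def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol

def min_norm_quadrature_weights(nodes, degree):
    """Minimize sum(w^2) subject to sum_x w_x phi_k(x) = integral of phi_k."""
    n_basis = 2 * degree + 1
    a = [[basis_row(k, x) for x in nodes] for k in range(n_basis)]
    b = [1.0] + [0.0] * (n_basis - 1)  # integrals of phi_k against mu
    gram = [[sum(a[i][m] * a[j][m] for m in range(len(nodes)))
             for j in range(n_basis)] for i in range(n_basis)]
    lam = solve_linear(gram, b)  # solves (A A^T) lam = b
    return [sum(lam[k] * a[k][m] for k in range(n_basis))
            for m in range(len(nodes))]

# Hypothetical scattered nodes: a jittered version of 24 equally spaced points.
n_nodes, degree = 24, 4
nodes = [2.0 * math.pi * (i + 0.3 * math.sin(i)) / n_nodes for i in range(n_nodes)]
weights = min_norm_quadrature_weights(nodes, degree)
```

For equally spaced nodes the same routine returns the equal weights $1/M$, which is a convenient sanity check.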
In view of (2.7), (2.1), the inequalities (2.12) can be formulated as

$$\|P\|_1 \le c_1 q(C)^{\alpha} \sum_{x \in C} |P(x)| \le c_2 \delta(C)^{\alpha} \sum_{x \in C} |P(x)| \le c_3 \sum_{x \in C} \mu(B(x, \delta(C))) |P(x)| \le c_4 \|P\|_1, \qquad P \in \Pi_L. \tag{2.15}$$

Inequalities of this nature were proved in the trigonometric case by Marcinkiewicz and Zygmund [35,
Chapter X, Theorem 7.28]. For this reason, we will refer to (2.15) as MZ inequalities.

Definition 2.1 Let $C \subset X$ be a finite set, $a_y$, $y \in C$, be real numbers, and $d > 0$. We will say that $\{a_y\}$ is
$d$-regular if, for some constant $c$ depending only on $X$ and the related quantities described in Section 2.1,
but not on $C$, $r$, or $d$, we have

$$\sum_{y \in C \cap B(x,r)} |a_y| \le c\{\mu(B(x,r)) + d^{\alpha}\}, \qquad x \in X,\ r > 0. \tag{2.16}$$

If $L > 0$, we will say that $\{a_y\}$ is a set of quadrature weights (or equivalently, the $a_y$'s are quadrature weights)
of order $L$ corresponding to $C$ if

$$\int_X P(y)\, d\mu(y) = \sum_{y \in C} a_y P(y), \qquad P \in \Pi_L.$$

Thus, for example, the set $\{w_x\}_{x \in C}$ constructed in Theorem 2.1 is a $1/L$-regular set of quadrature
weights of order $L$ corresponding to $C$. We will show in Lemma 5.3 below that the sets $\{a_y\}_{y \in C}$, where
each $a_y = \mu(B(y, \delta(C)))$ (respectively, $\delta(C)^{\alpha}$, $q(C)^{\alpha}$), are all $\delta(C)$- or $q(C)$-regular, but of course, not
quadrature weights.

2.3 Eignets 

The notion of eignets, analogous to the notion of radial basis function (RBF)/neural networks, is defined 
as follows. 

Definition 2.2 Let $C \subset X$ be a finite set, and $G : X \times X \to \mathbb{R}$. An eignet with centers $C$ and kernel $G$ is
a function of the form $\sum_{y \in C} a_y G(\circ, y)$, where the coefficients $a_y \in \mathbb{R}$, $y \in C$. The set of all eignets with
centers $C$ will be denoted by $\mathcal{G}(C) = \mathcal{G}(G; C)$.






We note that $\mathcal{G}(C)$ is a linear space. In the parlance of the theory of RBF/neural networks, the kernel $G$
may be thought of as the activation function.

As mentioned in the introduction, we are interested in this paper in the case when the kernel $G$ admits
a formal expansion of the form $G(x,y) = \sum_{j=0}^{\infty} b(\ell_j)\phi_j(x)\phi_j(y)$, where the coefficients $b(\ell_j)$ behave like
$\ell_j^{-\beta}$ for some $\beta > 0$. (This is the reason for our terminology "eignet", to emphasize the formal expansion
in terms of what would usually be eigenfunctions of the Laplace-Beltrami operator on a manifold.) The
following definition makes this sentiment more precise. In the sequel, $S > \alpha$ will be a fixed integer.

Definition 2.3 Let $\beta \in \mathbb{R}$. A function $b : \mathbb{R} \to \mathbb{R}$ will be called a mask of type $\beta$ if $b$ is an even, $S$ times
continuously differentiable function such that for $t > 0$, $b(t) = (1 + t)^{-\beta} F_b(\log t)$ for some $F_b : \mathbb{R} \to \mathbb{R}$
such that $|F_b^{(k)}(t)| \le c(b)$, $t \in \mathbb{R}$, $k = 0, 1, \cdots, S$, and $F_b(t) \ge c_1(b)$, $t \in \mathbb{R}$. A function $G : X \times X \to \mathbb{R}$
will be called a kernel of type $\beta$ if it admits a formal expansion $G(x,y) = \sum_{j=0}^{\infty} b(\ell_j)\phi_j(x)\phi_j(y)$ for some
mask $b$ of type $\beta > 0$. If we wish to specify the connection between $G$ and $b$, we will write $G(b; x, y)$ in
place of $G$.

We observe that $\lim_{t \to -\infty} F_b(t) = b(0)$ is finite. Further, the definition of a mask of type $\beta$ can be
relaxed somewhat; for example, the various bounds on $F_b$ and its derivatives may only be assumed for
sufficiently large values of $|t|$ rather than for all $t \in \mathbb{R}$. If this is the case, one can construct a new kernel
by adding a suitable diffusion polynomial (of a fixed degree) to $G$, as is customary in the theory of radial
basis functions, and obtain a kernel whose mask satisfies the definition given above. This does not add
any new feature to our theory. Therefore, we assume the more restrictive definition as given above.

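For an $S$ times continuously differentiable function $F$, we define

$$|||F|||_S := \sup_{0 \le k \le S,\ x \in \mathbb{R}} |F^{(k)}(x)|.$$

Let $b$ be a mask of type $\beta \in \mathbb{R}$. In the sequel, if $L > 0$, we will write $b_L(t) = b(Lt)$. It is easy to verify
by induction that

$$\left| t^k \frac{d^k}{dt^k}\left((1 + t)^{\beta} b(t)\right) \right| = \left| t^k \frac{d^k}{dt^k} F_b(\log t) \right| \le c(b) c_2, \qquad t > 0,\ k = 0, \cdots, S,$$

and hence,

$$\left| t^k \frac{d^k}{dt^k}\left((1/L + t)^{\beta} b_L(t)\right) \right| \le c(b) c_2 L^{-\beta}, \qquad t > 0,\ k = 0, \cdots, S,\ L > 0. \tag{2.17}$$

Since $b(t)^{-1}$ is a mask of type $-\beta$, we record that

$$\left| t^k \frac{d^k}{dt^k}\left((1/L + t)^{\beta} b_L(t)\right)^{-1} \right| \le c(b) c_2 L^{\beta}, \qquad t > 0,\ k = 0, \cdots, S,\ L > 0. \tag{2.18}$$

Finally, if $g : \mathbb{R} \to \mathbb{R}$ is any compactly supported, $S$ times continuously differentiable function, such that
$g(t) = 0$ on some neighborhood of $0$, then (2.17), (2.18) imply

$$|||g b_L|||_S \le c(b,g) L^{-\beta}, \qquad |||g/b_L|||_S \le c(b,g) L^{\beta}, \qquad L \ge 1. \tag{2.19}$$
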
3 Main results 

In the remainder of this paper, we fix a number $\beta > 0$, a mask $b$ of type $\beta$, and the corresponding
kernel $G$. Our main goal in this paper is to construct eignets for approximation of functions in $X^p$ and
develop an equivalence theorem for approximation by these. In comparison with the approximation theory
paradigm described in the introduction, we choose $X^p$ as the metric space in which the approximation
takes place. We consider a nested sequence $\{C_m\}$ of finite subsets of $X$, each satisfying (2.11), and such
that $q(C_m) \sim \delta(C_m) \sim 1/m$, $m = 1, 2, \cdots$. We let $V_m$ be the space $\mathcal{G}(C_m)$. Clearly, $V_m \subseteq V_{m+1}$ for
$m = 1, 2, \cdots$. If $\beta > \alpha/p'$, we will show in Proposition 5.2 below that each $V_m \subset X^p$. Our initial
choice of smoothness classes is the following. If $f \in L^1 + L^{\infty}$ and $r > 0$, we define formally $(\Delta^*)^r f$ by
$\langle (\Delta^*)^r f, \phi_k \rangle = (1 + \ell_k)^r \langle f, \phi_k \rangle$, $k = 0, 1, \cdots$. Let $W_r^p$ be the class of all $f$ such that $(\Delta^*)^r f \in X^p$.
It is proved in [35] (cf. Proposition 5.3 below) that for $f \in W_r^p$ and $L > 0$,

$$\mathrm{dist}(p; f, \Pi_L) \le c L^{-r} \|(\Delta^*)^r f\|_p.$$

Thus, our goal is to approximate a diffusion polynomial in $\Pi_L$ by eignets, keeping track of the errors. For
this purpose, we need another pseudo-differential operator.

Definition 3.1 The operator $\mathcal{D} = \mathcal{D}_G$ is defined formally by $\langle \mathcal{D}_G f, \phi_k \rangle = \langle f, \phi_k \rangle / b(\ell_k)$, $k = 0, 1, \cdots$.
Clearly, $\mathcal{D}_G$ is defined on $\Pi_{\infty}$, and it is easy to verify the fundamental fact that

$$P(x) = \int_X (\mathcal{D}_G P)(y) G(x,y)\, d\mu(y), \qquad P \in \Pi_{\infty},\ x \in X. \tag{3.1}$$

Our eignets will be discretizations of the integral above. Thus, if $C \subset X$ is a finite set, and $W = \{w_y\}_{y \in C}$
are some real numbers, we define

$$\mathbb{G}(C; W; P, x) := \mathbb{G}(G; C; W; P, x) := \sum_{y \in C} w_y (\mathcal{D}_G P)(y) G(x,y), \qquad P \in \Pi_{\infty},\ x \in X. \tag{3.2}$$

We note that $\mathbb{G}$ defines a linear operator on $\Pi_{\infty}$.
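The discretization (3.2) can be tried out in a toy setting. The sketch below works on the unit circle with equally spaced centers and equal weights, and uses a kernel whose expansion is truncated at a finite degree so that the quadrature is exact and the eignet reproduces $P$ up to rounding. These are all assumptions made for the illustration, not statements from the paper: the circle model, the mask $b(t) = (1 + t^2)^{-\beta/2}$, the truncation, and every function name. For the untruncated kernel one would instead see a small error of the kind quantified in the theorems of this section.

```python
import math

def phi(k, x):
    """Assumed circle system: 1, sqrt2*cos(x), sqrt2*sin(x), sqrt2*cos(2x), ..."""
    if k == 0:
        return 1.0
    freq = (k + 1) // 2
    trig = math.cos if k % 2 == 1 else math.sin
    return math.sqrt(2.0) * trig(freq * x)

def ell(k):
    """Frequency attached to the k-th basis function."""
    return (k + 1) // 2

def mask(t, beta=2.0):
    """Hypothetical even, smooth mask decaying like t^(-beta)."""
    return (1.0 + t * t) ** (-beta / 2.0)

def kernel(x, y, n_basis):
    """Truncated expansion sum_k b(ell_k) phi_k(x) phi_k(y)."""
    return sum(mask(ell(k)) * phi(k, x) * phi(k, y) for k in range(n_basis))

def poly(coeffs, x):
    """Diffusion polynomial with the given coefficients in the basis phi_k."""
    return sum(c * phi(k, x) for k, c in enumerate(coeffs))

def eignet_coefficients(coeffs, nodes, weights):
    """a_y = w_y * (D_G P)(y), where D_G divides the k-th coefficient by b(ell_k)."""
    d_coeffs = [c / mask(ell(k)) for k, c in enumerate(coeffs)]
    return [w * poly(d_coeffs, y) for w, y in zip(weights, nodes)]

def eignet(a, nodes, x, n_basis):
    """Evaluate sum_y a_y G(x, y)."""
    return sum(ay * kernel(x, y, n_basis) for ay, y in zip(a, nodes))

coeffs = [0.5, 1.0, -2.0, 0.0, 0.25, 0.0, 1.5]   # a polynomial of degree 3
n_basis = 41                                      # kernel truncated at degree 20
n_nodes = 32                                      # 3 + 20 < 32, so no aliasing
nodes = [2.0 * math.pi * i / n_nodes for i in range(n_nodes)]
weights = [1.0 / n_nodes] * n_nodes
a = eignet_coefficients(coeffs, nodes, weights)
errors = [abs(eignet(a, nodes, x, n_basis) - poly(coeffs, x))
          for x in (0.0, 0.7, 2.5, 4.0)]
```

The condition that the polynomial degree plus the truncation degree is smaller than the number of nodes plays the role, in this toy model, of the requirement that the weights be exact for products, which is where the product assumption and the order $2AL$ enter in the general theory.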

Our strategy is to approximate a target function $f \in W_r^p$ first by a diffusion polynomial $P \in \Pi_L$ so
that $\|f - P\|_p = O(L^{-r})$. With a careful choice of $C$ and $W$, we will then show that $\|P - \mathbb{G}(C; W; P)\|_p =
O(L^{-r})$. The results are formulated below as our first theorem. We recall the constant $A \ge 2$ described
in the "product assumption" in Section 2.1.

Theorem 3.1 Let $C^* \subset X$ be a finite set satisfying (2.11), $L \sim q(C^*)^{-1}$, and $W^*$ be a $1/L$-regular set of
quadrature weights of order $2AL$ corresponding to $C^*$. Let $1 \le p \le \infty$, $\beta > \alpha/p'$, $0 < r < \beta$. Let $f \in W_r^p$,
and $P \in \Pi_L$ satisfy $\|f - P\|_p \le c L^{-r} \|(\Delta^*)^r f\|_p$. Then

$$\|f - \mathbb{G}(C^*; W^*; P)\|_p \le c_1 L^{-r} \|(\Delta^*)^r f\|_p. \tag{3.3}$$

We comment on the construction of the diffusion polynomial $P$ in the above theorem. In the sequel,
we let $h : \mathbb{R} \to \mathbb{R}$ be a fixed, infinitely differentiable, and even function, nonincreasing on $[0, \infty)$, such
that $h(t) = 1$ if $|t| \le 1/2$ and $h(t) = 0$ if $|t| \ge 1$. We will omit the mention of $h$ from the notation, and
all constants $c, c_1, \cdots$ may depend upon $h$. We define

$$\sigma_L(f, x) := \sigma_L(h; f, x) := \sum_{k=0}^{\infty} h(\ell_k/L) \langle f, \phi_k \rangle \phi_k(x), \qquad L > 0,\ x \in X,\ f \in L^1 + L^{\infty}. \tag{3.4}$$
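A function $h$ with the stated properties is easy to write down, and on the coefficient side (3.4) is just a multiplication of $\langle f, \phi_k \rangle$ by $h(\ell_k/L)$. The sketch below shows one possible choice; the particular formula for `h` (built from the standard bump function $e^{-1/u}$) and the names `smooth_step` and `sigma_L` are assumptions made for illustration, since the paper only fixes the properties of $h$.

```python
import math

def smooth_step(s):
    """Infinitely differentiable, equal to 0 for s <= 0 and to 1 for s >= 1."""
    def psi(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return psi(s) / (psi(s) + psi(1.0 - s))

def h(t):
    """Even, smooth, nonincreasing on [0, inf); 1 for |t| <= 1/2, 0 for |t| >= 1."""
    return 1.0 - smooth_step(2.0 * abs(t) - 1.0)

def sigma_L(coeffs, ells, L):
    """Coefficients of sigma_L(f): multiply <f, phi_k> by h(ell_k / L)."""
    return [h(ell / L) * c for c, ell in zip(coeffs, ells)]

# Hypothetical coefficients <f, phi_k> with frequencies ell_k = 0, 1, 1, 2, 2, ...
ells = [0, 1, 1, 2, 2, 3, 3, 4, 4]
coeffs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
filtered = sigma_L(coeffs, ells, L=4)
```

Since $h = 1$ on $[0, 1/2]$ and $h = 0$ on $[1, \infty)$, the operator leaves every element of $\Pi_{L/2}$ unchanged and maps into $\Pi_L$.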

It is proved in [25] (cf. Proposition EH below) that $\|f - \sigma_L(f)\|_p \le c L^{-r} \|(\Delta^*)^r f\|_p$, $L > 0$. Thus, if
$\{\langle f, \phi_k \rangle\}$ are known (or can be computed) for $\ell_k \le L$, we may take $\sigma_L(f)$ in place of $P$ in Theorem 3.1.
However, if $f \in X^{\infty}$ and only the values of $f$ at finitely many sites $C$ are known, then we may adopt
the following procedure instead. First, we consider $L$ (depending upon $\delta(C)$) such that Theorem 2.1 is
applicable, and yields a $1/L$-regular set of quadrature weights $W = \{w_y\}_{y \in C}$ of order $2AL$. We then
define

$$\sigma_L(C; W; f, x) := \sum_{y \in C} w_y f(y) \left\{ \sum_{k=0}^{\infty} h(\ell_k/L) \phi_k(y) \phi_k(x) \right\} = \sum_{k=0}^{\infty} h(\ell_k/L) \left\{ \sum_{y \in C} w_y f(y) \phi_k(y) \right\} \phi_k(x), \tag{3.5}$$

which is similar to $\sigma_L(f)$, except that the inner products $\langle f, \phi_k \rangle$ are discretized using the quadrature
weights. We will prove in Proposition 5.3 below that

$$\|f - \sigma_L(C; W; f)\|_{\infty} \le c L^{-r} \{\|f\|_{\infty} + \|(\Delta^*)^r f\|_{\infty}\}, \qquad f \in W_r^{\infty},\ L \ge 1. \tag{3.6}$$

Thus, $\sigma_L(C; W; f)$ can also be used in place of $P$ in Theorem 3.1 in the case when $p = \infty$ to obtain the
bound

$$\|f - \mathbb{G}(C^*; W^*; \sigma_L(C; W; f))\|_{\infty} \le c L^{-r} \{\|f\|_{\infty} + \|(\Delta^*)^r f\|_{\infty}\}, \qquad f \in W_r^{\infty},\ L \ge 1. \tag{3.7}$$

We may choose $C^* = C$ and $W^* = W$ in this case, but do not have to do so. On the other hand, if one does
not discretize the inner products $\langle f, \phi_k \rangle$ so carefully, then the approximation error might be substantially
worse than that in (3.6), as shown in the case of the sphere in [53]. The eignets $\mathbb{G}(C^*; W^*; P)$ with these
choices of $P$ have the advantage of stability as described in Theorem 3.2 below.

Next, we wish to consider the question whether the estimate (3.3) is the best possible for individual
functions, and whether the method of approximation described is the best possible. We wish we could
say that if there is any sequence $s_m \in V_m$ of eignets with $\|f - s_m\|_p = O(m^{-r})$ then necessarily $f \in W_r^p$.
However, such a statement is not true even in the classical trigonometric case. For example, for any $r > 0$,
the function $f(x) = \sum_{k=1}^{\infty} \frac{\sin kx}{k^{1+r}}$ satisfies the condition that the uniform degree of approximation to $f$ from
trigonometric polynomials of degree at most $m$ is $O(m^{-r})$. However, there is a continuous function $f_1$
such that $(\Delta^*)^r f(x) = f_1(x) + \sum_{k=1}^{\infty} \frac{\sin kx}{k}$ is not continuous. In the classical trigonometric case, one needs
to enlarge the smoothness class to achieve such an equivalence. This is done via $K$-functionals. We now
introduce this notion in the present context. Not to confuse the notation with the heat kernel or the
corresponding operator, we will use the notation $\omega$ for the $K$-functional, motivated by the equivalence of
the $K$-functional and a modulus of smoothness in the trigonometric case.
If $f \in X^p$, $r > 0$ is an integer, we define for $\delta > 0$

$$\omega_r(p; f, \delta) := \inf\{\|f - f_1\|_p + \delta^r \|(\Delta^*)^r f_1\|_p : f_1 \in W_r^p\}. \tag{3.8}$$

If $\gamma > 0$, we choose an integer $r > \gamma$, and define the smoothness class $H_{\gamma}^p$ to be the class of all $f \in X^p$
such that

$$\|f\|_{H_{\gamma}^p} := \sup_{\delta \in (0,1]} \frac{\omega_r(p; f, \delta)}{\delta^{\gamma}} < \infty. \tag{3.9}$$

It can be shown that different values of $r > \gamma$ give rise to the same smoothness class with equivalent
norms (cf. [5]). We note that $W_r^p \subset H_r^p$ for every integer $r \ge 1$. The class $H_{\gamma}^p$ turns out to be the right
enlargement for characterization by approximation by eignets.

First, however, we wish to state the following version of Theorem 3.1 in the case when the special
polynomials are chosen in place of $P$ in that theorem. A popular technique in learning theory is to obtain
an approximation by minimizing a regularization functional. For example, the quantity $\omega_r(p; f, \delta)$ is such
a functional. The following theorem shows that the operators $\mathbb{G}$ defined with these special polynomials
satisfy, up to a constant multiple, a minimal regularization property.

Theorem 3.2 Let $1 \le p \le \infty$, $f \in X^p$, $\beta > \alpha/p'$, $0 < r < \beta - \alpha/p'$, $L > 0$, and $C^*$, $W^*$ be as in
Theorem 3.1.

(a) With $\mathbb{G}_L(f, x) = \mathbb{G}(C^*; W^*; \sigma_L(f), x)$, $x \in X$, we have

$$\|f - \mathbb{G}_L(f)\|_p + L^{-r} \|(\Delta^*)^r \mathbb{G}_L(f)\|_p \le c\, \omega_r(p; f, 1/L). \tag{3.10}$$

In particular, $\|\mathbb{G}_L(f)\|_p \le c \|f\|_p$.

(b) Let $C \subset X$ be a finite set satisfying (2.11), and $W = \{w_y\}_{y \in C}$ be a $1/L$-regular set of quadrature weights
on $C$ of order $2AL$. For $\mathbb{G}_L(C; W; f, x) = \mathbb{G}(C^*; W^*; \sigma_L(C; W; f), x)$, $x \in X$, we have

$$\|\mathbb{G}_L(C; W; f)\|_p \le c \left\{ \sum_{y \in C} |w_y| |f(y)|^p \right\}^{1/p}, \tag{3.11}$$

and

$$\|f - \mathbb{G}_L(C; W; f)\|_{\infty} + L^{-r} \|(\Delta^*)^r \mathbb{G}_L(C; W; f)\|_{\infty} \le c\{\omega_r(\infty; f, 1/L) + L^{-r} \|f\|_{\infty}\}. \tag{3.12}$$

We are now ready to state the equivalence theorem for the spaces $V_m$ described at the beginning of
this section. We assume that for each $m \ge 1$, $q(C_m) \sim 1/m$, and there exists a $1/m$-regular set
$W_m$ of quadrature weights of order $2Am$ based on the set $C_m$. For $1 \le p \le \infty$ and $f \in X^p$, let

$$\mathbb{G}_m(f, x) := \mathbb{G}(C_m; W_m; \sigma_m(f), x), \qquad x \in X,\ m = 1, 2, \cdots. \tag{3.13}$$

We note that there is no conflict with the notation in Theorem 3.2 since we may choose $C^* = C_L$,
$W^* = W_L$.

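Theorem 3.3 Suppose that

$$K_t(x,x) \ge c\,t^{-\alpha/2}, \qquad x \in \mathbb{X},\ t \in (0,1]. \tag{3.14}$$

Then the following are equivalent for each $\gamma$ with $0 < \gamma < \beta - \alpha/p'$:

(a) $f \in H_\gamma^p$.

(b) $\sup_{m \ge 1} m^\gamma \|f - \mathbb{G}_m(f)\|_p \le c(f)$.

(c) $\sup_{m \ge 1} m^\gamma \operatorname{dist}(L^p; f, \mathbb{G}(\mathcal{C}_m)) \le c(f)$.

In the case when $p = \infty$, each of these assertions is also equivalent to

(d) $\sup_{m \ge 1} m^\gamma \|f - \mathbb{G}(\mathcal{C}_m; W_m; \sigma_m(\mathcal{C}_m; W_m; f))\|_\infty \le c(f)$.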

Thus, if one considers the class $H_\gamma^p$ in place of $W_r^p$, then the estimates of the form given in Theorem 3.3 (b) (or (d)) are best possible for individual functions. One may also formulate a similar equivalence theorem for Besov spaces, defined by replacing the supremum expression in (3.9) by a suitable integral expression. However, this would only complicate our notations rather than adding any new insight into the subject. Therefore, we prefer not to do so. We note that in the case when the $\phi_j$'s (respectively, $\ell_j$'s) are the eigenfunctions (respectively, eigenvalues) of the negative square root of the Laplace-Beltrami operator, Minakshisundaram and Pleijel have proved an asymptotic expression for the heat kernel in [25], which implies both (3.14) and (2.4). In [T5], Hörmander has obtained uniform asymptotics for the sums $\sum_{\ell_j \le L} \phi_j^2(x)$ for a very general class of elliptic differential operators on a manifold. It will be shown in Lemma 5.2 that these lead to (3.14) and (2.4) (with $x = y$). Further examples are given by Grigoryan [16] and references therein.

We end this section by recording two interesting facts, valid for arbitrary eignets of type $\beta$. The first of these facts relates the coefficients of the eignet with its norm. For a sequence (or vector) of complex numbers $\mathbf{a} = \{a_j\}$ and $1 \le p \le \infty$, we denote by $\|\mathbf{a}\|_{\ell^p}$ the usual sequential (or Euclidean) $\ell^p$ norm.

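Theorem 3.4 We assume that (3.14) holds. Let $1 \le p \le \infty$, $\beta > \alpha/p'$, $\mathcal{C} \subset \mathbb{X}$ be a finite set, $a_y \in \mathbb{R}$, $y \in \mathcal{C}$, and $\mathbf{a} = (a_y)_{y\in\mathcal{C}}$. Then

$$\|\mathbf{a}\|_{\ell^p} \le c\,q(\mathcal{C})^{\alpha/p' - \beta}\,\Big\|\sum_{y\in\mathcal{C}} a_y G(\circ, y)\Big\|_p. \tag{3.15}$$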



The second fact describes the simultaneous approximation property of eignets. 



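Theorem 3.5 We assume that (3.14) holds. Let $1 \le p \le \infty$, $0 < \gamma < \beta - \alpha/p'$, $0 < \gamma < r < \beta$, and $f \in W_r^p$. If $\Psi_m \in V_m$ satisfy $\|f - \Psi_m\|_p \le c\,m^{-r}\|(\Delta^*)^r f\|_p$, then also $\|(\Delta^*)^\gamma f - (\Delta^*)^\gamma \Psi_m\|_p \le c\,m^{\gamma - r}\|(\Delta^*)^r f\|_p$.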



4 An abstraction 

In our proofs, we need to estimate many sums and integrals. Since these estimates involve similar ideas, we prefer to deal with them in a unified manner by treating sums as integrals with respect to finitely supported measures. We observe that if $\mathcal{C} \subset \mathbb{X}$, and $W_x$, $x \in \mathcal{C}$, are any real numbers, a sum of the form $\sum_{x\in\mathcal{C}} W_x f(x)$ can be expressed as a Lebesgue-Stieltjes integral $\int f\,d\nu$, where $\nu$ is the measure that associates the mass $W_x$ with each point $x \in \mathcal{C}$. The total variation measure in this case is given by $|\nu|(B) = \sum_{x\in B\cap\mathcal{C}} |W_x|$, $B \subset \mathbb{X}$. Thus, for example, in (3.5), if $\nu$ is the measure that associates with each $y \in \mathcal{C}$ the mass $w_y \in W$, then we may write

$$\sigma_L(\nu; f, x) := \int f(y) \sum_k h(\ell_k/L)\,\phi_k(y)\phi_k(x)\,d\nu(y) \tag{4.1}$$

in place of the more cumbersome notation $\sigma_L(\mathcal{C}; W; f, x)$, helping us thereby to focus our attention on the essential aspects of this measure rather than the choice of $\mathcal{C}$ and $W$. Moreover, if one takes $\mu$ in place of $\nu$, then $\sigma_L(\mu; f) = \sigma_L(f)$. In addition to being concise, this notation has another major advantage. If the information available about the target function $f$ is neither the spectral data $\{\langle f, \phi_k\rangle\}$ nor point evaluations, but, for example, averages of $f$ over small balls, the notation allows one to treat this case as well without introducing yet another notation, just by defining $\nu$ appropriately. In the sequel, with the exception of a few occasions, we will typically use $\nu$ to be one of the following measures: (1) $\mu$, (2) the measure that associates the mass $w_y$ with each $y \in \mathcal{C}$ for some $\mathcal{C}$, (3) the measure that associates the mass $q(\mathcal{C})^\alpha$ with each $y \in \mathcal{C}$, and (4) various linear combinations of the above measures.
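
As a minimal illustration of this viewpoint, the following Python sketch stores a finitely supported signed measure as a dictionary of point masses, so that a weighted sum is evaluated as an integral and $|\nu|(B)$ as a sum of absolute masses. The names `integrate` and `total_variation`, the points on the real line, and the sample masses are hypothetical and serve only to make the notation concrete.

```python
from typing import Callable, Dict

# A finitely supported signed measure: each support point x carries the mass W_x.
Measure = Dict[float, float]


def integrate(f: Callable[[float], float], nu: Measure) -> float:
    """Integral of f with respect to nu, i.e. the sum of W_x * f(x) over the support."""
    return sum(mass * f(x) for x, mass in nu.items())


def total_variation(nu: Measure, in_set: Callable[[float], bool]) -> float:
    """|nu|(B): the sum of |W_x| over the support points x that lie in B."""
    return sum(abs(mass) for x, mass in nu.items() if in_set(x))


# Hypothetical example: three point masses, one of them negative.
nu = {0.0: 0.5, 0.25: -0.25, 0.5: 0.75}
value = integrate(lambda x: x * x, nu)        # 0.5*0 - 0.25*0.0625 + 0.75*0.25
tv = total_variation(nu, lambda x: x < 0.4)   # |0.5| + |-0.25|
```

Replacing the dictionary by quadrature weights $w_y$ or by the constant masses $q(\mathcal{C})^\alpha$ gives the measures (2) and (3) listed above without changing the two functions.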

To demonstrate a technical advantage, Definition 2.1 takes the following form, where the ambiguity and tacit understanding about what the constants depend upon can be avoided, and we get the full advantage of the vector space properties of measures.

Definition 4.1 Let $d > 0$. A signed measure $\nu$ defined on $\mathbb{X}$ will be called $d$-regular if there exists a constant $c = c(\nu) > 0$ such that

$$|\nu|(B(x,r)) \le c\{\mu(B(x,r)) + d^\alpha\}, \qquad x \in \mathbb{X},\ r > 0, \tag{4.2}$$

where $\alpha$ is the constant introduced in (2.1). Let $\mathcal{M}_d$ denote the class of all signed measures satisfying (4.2). Then $\mathcal{M}_d$ is a vector space. For $\nu \in \mathcal{M}_d$, if we denote by $\|\nu\|_{\mathcal{M}_d}$ the infimum of the constants $c$ which serve in (4.2), then $\|\circ\|_{\mathcal{M}_d}$ is a norm on $\mathcal{M}_d$.
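
The following Python sketch illustrates the $d$-regularity condition on a toy case that is not taken from the paper: the unit circle with normalized arc length as $\mu$ (so $\alpha = 1$ and $\mu(B(x,r)) = \min(1, r/\pi)$), and the measure $\nu$ that puts the mass $q$ on each of $N$ equally spaced points, where $q = 2\pi/N$ is their separation. A ball of radius $r$ contains at most $2r/q + 1$ points, so $|\nu|(B(x,r)) \le 2r + q \le 2\pi(r/\pi + q)$, and the sampled ratio below should stay under $2\pi$ for every $N$. All function names are hypothetical.

```python
import math


def mu_ball(r: float) -> float:
    """Normalized arc-length measure of a ball of radius r on the unit circle."""
    return min(1.0, r / math.pi)


def arc_dist(x: float, y: float) -> float:
    """Geodesic distance between the angles x and y on the unit circle."""
    t = abs(x - y) % (2.0 * math.pi)
    return min(t, 2.0 * math.pi - t)


def regularity_ratio(n_points: int, n_centers: int = 64, n_radii: int = 64) -> float:
    """Largest sampled value of |nu|(B(x,r)) / (mu(B(x,r)) + d) with d = q, alpha = 1."""
    q = 2.0 * math.pi / n_points                 # separation of the support points
    points = [q * k for k in range(n_points)]
    worst = 0.0
    for i in range(n_centers):
        x = 2.0 * math.pi * i / n_centers
        for j in range(1, n_radii + 1):
            r = math.pi * j / n_radii
            mass = q * sum(1 for y in points if arc_dist(x, y) <= r)
            worst = max(worst, mass / (mu_ball(r) + q))
    return worst


# The bound does not deteriorate as the point set is refined.
ratios = [regularity_ratio(n) for n in (8, 32, 128)]
```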



For example, $\mu$ itself is in $\mathcal{M}_d$ with $\|\mu\|_{\mathcal{M}_d} = 1$ for every $d > 0$. If $\mathcal{C} \subset \mathbb{X}$ is as in Theorem 2.1, then we will show in Lemma 5.3 below that the measures that associate the mass $\mu(B(x, \delta(\mathcal{C})))$ (respectively, $\delta(\mathcal{C})^\alpha$, $q(\mathcal{C})^\alpha$, $w_x$, $|w_x|$) with $x \in \mathcal{C}$ are all in $\mathcal{M}_{\delta(\mathcal{C})}$ as well as $\mathcal{M}_{q(\mathcal{C})}$ with $\|\nu\|_{\mathcal{M}_{q(\mathcal{C})}} \le c$, where the constant is independent of $\mathcal{C}$. It is also easy to see that for any $c > 0$, $\mathcal{M}_d \subset \mathcal{M}_{cd}$, with $\|\nu\|_{\mathcal{M}_{cd}} \le \max(1, c^{-\alpha})\|\nu\|_{\mathcal{M}_d}$. In view of (2.1) and (2.7), the condition (4.2) is equivalent to

$$|\nu|(B(x,r)) \le c\|\nu\|_{\mathcal{M}_d}(r + d)^\alpha \le c_1\|\nu\|_{\mathcal{M}_d}\,\mu(B(x, r + d)). \tag{4.3}$$

Finally, we note that since $\mu$ is a probability measure, the condition (4.2) implies that $|\nu|(B) \le c(1 + d^\alpha)$ for every ball $B \subset \mathbb{X}$, and hence, that $|\nu|(\mathbb{X}) \le c(1 + d^\alpha)$ as well.

The quadrature formula (2.14) can be restated in the form

$$\int P(y)\,d\mu(y) = \int P(y)\,d\nu(y), \qquad P \in \Pi_L, \tag{4.4}$$

where $\nu$ is the measure that associates the mass $w_y$ with each $y \in \mathcal{C}$. Any (signed or positive) measure $\nu$ satisfying (4.4) will be called a quadrature measure of order $L$; in particular, $\mu$ itself is a quadrature measure of order $L$ for every $L > 0$.

If $\nu$ is a signed or positive Borel measure on $\mathbb{X}$, $X \subset \mathbb{X}$ is $\nu$-measurable, and $f : X \to \mathbb{C}$ is a $\nu$-measurable function, we will write

$$\|f\|_{\nu;X,p} := \begin{cases} \Big\{\int_X |f(x)|^p\,d|\nu|(x)\Big\}^{1/p}, & \text{if } 1 \le p < \infty, \\ |\nu|\text{-}\operatorname{ess\,sup}_{x\in X} |f(x)|, & \text{if } p = \infty. \end{cases}$$

We will write $L^p(\nu; X)$ to denote the class of all $\nu$-measurable functions $f$ for which $\|f\|_{\nu;X,p} < \infty$, where two functions are considered equal if they are equal $|\nu|$-almost everywhere. To make the notation consistent with the one introduced before, we will omit the mention of $\nu$ if $\nu = \mu$ and that of $X$ if $X = \mathbb{X}$.
In the sequel, for any $H : \mathbb{R} \to \mathbb{R}$, we define formally

$$\Phi_L(H; x, y) := \sum_{j=0}^\infty H(\ell_j/L)\,\phi_j(x)\phi_j(y), \qquad x, y \in \mathbb{X},\ L > 0. \tag{4.5}$$

For example, $G(x,y) = \Phi_L(b_L; x, y)$. If $\nu$ is any measure on $\mathbb{X}$ and $f \in L^p$, we may define formally

$$\sigma_L(H; \nu; f, x) := \int f(y)\,\Phi_L(H; x, y)\,d\nu(y). \tag{4.6}$$

As before, we will omit the mention of $\nu$ if $\nu = \mu$ and that of $H$ if $H = h$. Thus, $\Phi_L(x,y) = \Phi_L(h; x, y)$, and similarly $\sigma_L(f, x) = \sigma_L(h; \mu; f, x)$, $\sigma_L(\nu; f, x) = \sigma_L(h; \nu; f, x)$. The slight inconsistency is resolved by the fact that we use $\mu$, $\nu$, $\tilde\nu$ etc. to denote measures, $h$, $g$, $b$, $H$, etc. to denote functions, and $X$, $\mathbb{X}$ to denote sets. We do not consider this to be a sufficiently important issue to complicate our notations. We note that $\sigma_L(G(\circ, y), x) = \Phi_L(h b_L; x, y)$.
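
To make the kernel $\Phi_L$ and the operator $\sigma_L(\nu; f)$ concrete, the following Python sketch works on a toy model that is an assumption of this example and not part of the paper: the unit circle with normalized arc length, where the eigenvalue $\ell = k$ has the eigenfunctions $\sqrt{2}\cos kx$ and $\sqrt{2}\sin kx$, so that $\Phi_L(H; x, y) = H(0) + 2\sum_{k \ge 1} H(k/L)\cos(k(x - y))$. The cutoff `h` is a hypothetical piecewise linear stand-in for the smooth $h$ of the paper, and $\nu$ is the equal-weight measure on $N$ equally spaced nodes, which integrates trigonometric polynomials of degree less than $N$ exactly. Since $h = 1$ on $[0, 1/2]$, the operator should reproduce trigonometric polynomials of degree at most $L/2$.

```python
import math
from typing import Callable, Sequence


def h(t: float) -> float:
    """Hypothetical cutoff: 1 on [0, 1/2], 0 on [1, inf), linear in between."""
    s = abs(t)
    if s <= 0.5:
        return 1.0
    if s >= 1.0:
        return 0.0
    return 2.0 * (1.0 - s)


def Phi(L: int, H: Callable[[float], float], x: float, y: float) -> float:
    """Phi_L(H; x, y) on the circle; H is assumed to vanish outside [-1, 1]."""
    return H(0.0) + sum(2.0 * H(k / L) * math.cos(k * (x - y)) for k in range(1, L + 1))


def sigma(L: int, H: Callable[[float], float], nodes: Sequence[float],
          weights: Sequence[float], f: Callable[[float], float], x: float) -> float:
    """sigma_L(H; nu; f, x) for the measure nu with mass weights[i] at nodes[i]."""
    return sum(w * f(y) * Phi(L, H, x, y) for y, w in zip(nodes, weights))


L, N = 8, 16                                   # N exceeds the degree 4 + 7 of P * Phi_L
nodes = [2.0 * math.pi * i / N for i in range(N)]
weights = [1.0 / N] * N
P = lambda t: 1.0 + math.cos(3.0 * t) - 0.5 * math.sin(4.0 * t)   # degree 4 = L/2
error = max(abs(sigma(L, h, nodes, weights, P, x) - P(x)) for x in (0.0, 0.7, 2.5))
```

The same `sigma` with $\mu$ replaced by other point measures covers the cases (2) and (3) of the list of measures in this section; only the nodes and weights change.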

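In the sequel, we define $g$ by $g(t) = h(t) - h(2t)$. We note that $g$ is supported on $(1/4, 1) \cup (-1, -1/4)$, and

$$h\Big(\frac{t}{2^n}\Big) = h(t) + \sum_{k=1}^n g\Big(\frac{t}{2^k}\Big), \qquad t \in \mathbb{R},\ n = 1, 2, \ldots. \tag{4.7}$$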
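
The identity (4.7) is a telescoping sum, since $g(t/2^k) = h(t/2^k) - h(t/2^{k-1})$. The following Python sketch checks it numerically for one hypothetical choice of $h$ (a smooth even cutoff equal to 1 on $[-1/2, 1/2]$ and vanishing outside $(-1, 1)$); the paper only requires $h$ to have these general properties, so the specific formula below is an assumption.

```python
import math


def h(t: float) -> float:
    """Hypothetical smooth even cutoff: 1 on [-1/2, 1/2], 0 outside (-1, 1)."""
    s = abs(t)
    if s <= 0.5:
        return 1.0
    if s >= 1.0:
        return 0.0
    u = 2.0 * s - 1.0                      # u runs over (0, 1)
    a = math.exp(-1.0 / (1.0 - u))
    b = math.exp(-1.0 / u)
    return a / (a + b)


def g(t: float) -> float:
    """g(t) = h(t) - h(2t); vanishes for |t| <= 1/4 and for |t| >= 1."""
    return h(t) - h(2.0 * t)


def telescoped(t: float, n: int) -> float:
    """Right-hand side of (4.7): h(t) plus the dyadic pieces g(t / 2^k), k = 1..n."""
    return h(t) + sum(g(t / 2 ** k) for k in range(1, n + 1))


gap = max(abs(h(t / 2 ** n) - telescoped(t, n))
          for t in (0.3, 1.7, 5.0, 37.2) for n in range(1, 8))
```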



5 Technical preparation 

In Section 5.1, we prove a few facts regarding the kernels which will be used very often in the proofs in Section 6 as well as the rest of the proofs in this section. In Section 5.2, we describe several properties of diffusion polynomials and approximation by these. Since we do not need all the assumptions listed in Section 2.1, we will list in each theorem only those assumptions which are needed there.



5.1 Kernels 

We will often use the following simple application of the Riesz-Thorin interpolation theorem [5, Theorem 1.1.1] to estimate the operators defined in terms of kernels.

Lemma 5.1 Let $\nu_1$, $\nu_2$ be signed measures (having bounded variation) on a measure space $\Omega$, supported on $\Omega_1$ and $\Omega_2$ respectively, $\Phi : \Omega \times \Omega \to \mathbb{R}$ be a bounded, $|\nu_1| \times |\nu_2|$ measurable function, $1 \le p \le \infty$, $f \in L^1(\nu_1; \Omega) \cap L^\infty(\nu_1; \Omega)$, and let

$$Tf(x) := \int f(t)\,\Phi(x,t)\,d\nu_1(t).$$

Then with

$$A_1 = \sup_{t\in\Omega} \|\Phi(\circ, t)\|_{|\nu_2|;\Omega,1}, \qquad A_\infty = \sup_{x\in\Omega} \|\Phi(x, \circ)\|_{|\nu_1|;\Omega,1},$$

we have

$$\|Tf\|_{|\nu_2|;\Omega,p} \le A_1^{1/p} A_\infty^{1/p'} \|f\|_{|\nu_1|;\Omega,p}. \tag{5.1}$$

Proof. It is clear that $\|Tf\|_{|\nu_2|;\Omega,\infty} \le A_\infty \|f\|_{|\nu_1|;\Omega,\infty}$. Fubini's theorem can be used to see that $\|Tf\|_{|\nu_2|;\Omega,1} \le A_1 \|f\|_{|\nu_1|;\Omega,1}$. The estimate (5.1) follows by the Riesz-Thorin interpolation theorem. □
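
For finitely supported measures, the estimate (5.1) is an inequality between weighted sums and can be checked directly. The following Python sketch does this for randomly generated data; it is only a numerical sanity check under the assumption that both measures are finite sums of signed point masses, and the names `weighted_norm` and `check_lemma` are hypothetical.

```python
import random


def weighted_norm(values, masses, p):
    """Norm of a function on a finite set with respect to |nu|, for finite p."""
    return sum(abs(v) ** p * abs(m) for v, m in zip(values, masses)) ** (1.0 / p)


def check_lemma(p, n=7, m=5, seed=0):
    """Return both sides of (5.1) for random signed point masses and a random kernel."""
    rng = random.Random(seed)
    nu1 = [rng.uniform(-1, 1) for _ in range(n)]      # signed masses of nu_1
    nu2 = [rng.uniform(-1, 1) for _ in range(m)]      # signed masses of nu_2
    Phi = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(m)]   # Phi[x][t]
    f = [rng.uniform(-1, 1) for _ in range(n)]
    # Tf(x) is the integral of f(t) * Phi(x, t) with respect to nu_1.
    Tf = [sum(f[t] * Phi[x][t] * nu1[t] for t in range(n)) for x in range(m)]
    # A_1: sup over t of the |nu_2|-integral in x; A_inf: sup over x of the |nu_1|-integral in t.
    A1 = max(sum(abs(Phi[x][t]) * abs(nu2[x]) for x in range(m)) for t in range(n))
    Ainf = max(sum(abs(Phi[x][t]) * abs(nu1[t]) for t in range(n)) for x in range(m))
    lhs = weighted_norm(Tf, nu2, p)
    rhs = A1 ** (1.0 / p) * Ainf ** (1.0 - 1.0 / p) * weighted_norm(f, nu1, p)
    return lhs, rhs


results = [check_lemma(p, seed=s) for p in (1.0, 1.5, 2.0, 4.0) for s in range(20)]
```

In the paper the lemma is applied with $\Phi = \Phi_L(H; \circ, \circ)$ and with $\nu_1$, $\nu_2$ chosen among $\mu$ and the discrete measures of Section 4.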

The starting point of our proofs is to recall the following theorem proved in [25], and in [14] in somewhat greater generality, stating the assumptions as they are stated in this paper.

Theorem 5.1 Let $S > \alpha$ be an integer, $H : \mathbb{R} \to \mathbb{R}$ be an even, $S$ times continuously differentiable function, supported on $[-1, 1]$. We assume further that (2.1), (2.4) hold. Then for every $x, y \in \mathbb{X}$, $L > 0$,
Consequently,

$$\sup_{x\in\mathbb{X}} \int_{\mathbb{X}} |\Phi_L(H; x, y)|\,d\mu(y) \le c\,|||H|||_S, \tag{5.3}$$

and for every $1 \le p \le \infty$ and $f \in L^p$,

$$\|\sigma_L(H; f)\|_p \le c\,|||H|||_S\,\|f\|_p. \tag{5.4}$$



The following Propositions 5.1 and 5.2 will be used very often in this section, with different interpretations for $H$ and the measures involved.

Proposition 5.1 Let $d > 0$, $S$, $H$ be as in Theorem 5.1, and (2.1), (2.4) hold. Let $\nu \in \mathcal{M}_d$, $L > 0$, and $c$ be the constant that appears in (2.1). Let $1 \le p \le \infty$, $1/p' + 1/p = 1$.

(a) If $g_1 : [0, \infty) \to [0, \infty)$ is a nonincreasing function, then for any $L > 0$, $r > 0$, $x \in \mathbb{X}$,

$$L^\alpha \int_{\Delta(x,r)} g_1(L\rho(x,y))\,d|\nu|(y) \le \frac{2^\alpha (c + (d/r)^\alpha)\,\alpha}{1 - 2^{-\alpha}}\,\|\nu\|_{\mathcal{M}_d} \int_{rL/2}^\infty g_1(u)\,u^{\alpha - 1}\,du. \tag{5.5}$$

(b) If $r \ge 1/L$, then

$$\int_{\Delta(x,r)} |\Phi_L(H; x, y)|\,d|\nu|(y) \le c_1 (1 + (dL)^\alpha)(rL)^{-S+\alpha}\,\|\nu\|_{\mathcal{M}_d}\,|||H|||_S. \tag{5.6}$$

(c) We have

$$\int_{\mathbb{X}} |\Phi_L(H; x, y)|\,d|\nu|(y) \le c_2 (1 + (dL)^\alpha)\,\|\nu\|_{\mathcal{M}_d}\,|||H|||_S, \tag{5.7}$$

$$\|\Phi_L(H; x, \circ)\|_{\nu;\mathbb{X},p} \le c_3 L^{\alpha/p'} \{1 + (dL)^\alpha\}^{1/p}\,\|\nu\|_{\mathcal{M}_d}^{1/p}\,|||H|||_S. \tag{5.8}$$

Proof. By replacing $\nu$ by $|\nu|/\|\nu\|_{\mathcal{M}_d}$, we may assume that $\nu$ is positive, and $\|\nu\|_{\mathcal{M}_d} = 1$. With a similar normalization with $H$, we may also assume that $|||H|||_S = 1$. Moreover, for $r > 0$, $\nu(B(x,r)) \le \mu(B(x,r)) + d^\alpha \le (c + (d/r)^\alpha) r^\alpha$, where $c$ is the constant appearing in (2.1). In this proof only, we will write $A(x,t) = \{y \in \mathbb{X} : t \le \rho(x,y) < 2t\}$. We note that $\nu(A(x,t)) \le 2^\alpha (c + (d/r)^\alpha) t^\alpha$, $t \ge r$. Since $g_1$ is nonincreasing, we have

$$\begin{aligned}
\int_{\Delta(x,r)} g_1(L\rho(x,y))\,d\nu(y) &= \sum_{R=0}^\infty \int_{A(x,2^R r)} g_1(L\rho(x,y))\,d\nu(y) \\
&\le \sum_{R=0}^\infty g_1(2^R rL)\,\nu(A(x, 2^R r)) \le 2^\alpha (c + (d/r)^\alpha) \sum_{R=0}^\infty g_1(2^R rL)(2^R r)^\alpha \\
&\le \frac{2^\alpha (c + (d/r)^\alpha)\,\alpha\,r^\alpha}{1 - 2^{-\alpha}} \int_{1/2}^\infty g_1(urL)\,u^{\alpha-1}\,du = \frac{2^\alpha (c + (d/r)^\alpha)\,\alpha\,L^{-\alpha}}{1 - 2^{-\alpha}} \int_{rL/2}^\infty g_1(u)\,u^{\alpha-1}\,du.
\end{aligned}$$

This proves (5.5).

Let $x \in \mathbb{X}$, $L > 0$. For $r \ge 1/L$, $d/r \le dL$. In view of (5.2) and (5.5), we have

$$\int_{\Delta(x,r)} |\Phi_L(H; x, y)|\,d\nu(y) \le c_1 L^\alpha \int_{\Delta(x,r)} (L\rho(x,y))^{-S}\,d\nu(y) \le c_1 (c + (dL)^\alpha) \int_{rL/2}^\infty v^{-S+\alpha-1}\,dv \le c_2 (1 + (dL)^\alpha)(rL)^{-S+\alpha}.$$

This proves (5.6).

Using (5.6) with $r = 1/L$, we obtain that

$$\int_{\Delta(x,1/L)} |\Phi_L(H; x, y)|\,d\nu(y) \le c_2 (1 + (dL)^\alpha). \tag{5.9}$$

We observe that in view of (5.2), and the fact that $\nu(B(x, 1/L)) \le c_1 (1/L + d)^\alpha \le c_1 L^{-\alpha}(1 + (dL)^\alpha)$,

$$\int_{B(x,1/L)} |\Phi_L(H; x, y)|\,d\nu(y) \le c_1 L^\alpha\,\nu(B(x, 1/L)) \le c_2 (1 + (dL)^\alpha).$$

Together with (5.9), this leads to (5.7).

The estimate (5.8) follows from (5.2) in the case $p = \infty$, and from (5.7) in the case $p = 1$. For $1 < p < \infty$, it follows from the convexity inequality

$$\|F\|_{\nu;\mathbb{X},p} \le \|F\|_{\nu;\mathbb{X},\infty}^{1/p'}\,\|F\|_{\nu;\mathbb{X},1}^{1/p}. \tag{5.10}$$

□



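Corollary 5.1 Let $\beta \in \mathbb{R}$, $b$ be a mask of type $\beta$, $n \ge 1$ be an integer, $\nu \in \mathcal{M}_{2^{-n}}$, and (2.1), (2.4) hold. Then for integer $n \ge 1$,

$$\sup_{x\in\mathbb{X}} \int_{\mathbb{X}} |\Phi_{2^n}(h b_{2^n}; x, y)|\,d|\nu|(y) \le c\,\|\nu\|_{\mathcal{M}_{2^{-n}}} \begin{cases} 2^{-n\beta}, & \text{if } \beta < 0, \\ n, & \text{if } \beta = 0, \\ 1, & \text{if } \beta > 0, \end{cases} \tag{5.11}$$

and for $1 \le p \le \infty$,

$$\|\Phi_{2^n}(h b_{2^n}; x, \circ)\|_{\nu;\mathbb{X},p} \le c\,\|\nu\|_{\mathcal{M}_{2^{-n}}}^{1/p} \begin{cases} 2^{-n(\beta - \alpha/p')}, & \text{if } \beta < \alpha/p', \\ n, & \text{if } \beta = \alpha/p', \\ 1, & \text{if } \beta > \alpha/p'. \end{cases} \tag{5.12}$$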



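Proof. We normalize $\nu$ so that $\|\nu\|_{\mathcal{M}_{2^{-n}}} = 1$. In view of (4.7),

$$\begin{aligned}
\Phi_{2^n}(h b_{2^n}; x, y) &= \sum_{j=0}^\infty h(\ell_j) b(\ell_j)\phi_j(x)\phi_j(y) + \sum_{k=1}^n \sum_{j=0}^\infty g(\ell_j/2^k)\, b(\ell_j)\phi_j(x)\phi_j(y) \\
&= \sum_{\ell_j \le 1} h(\ell_j) b(\ell_j)\phi_j(x)\phi_j(y) + \sum_{k=1}^n \Phi_{2^k}(g b_{2^k}; x, y).
\end{aligned} \tag{5.13}$$

Since $h$ and $b$ are both bounded functions, (2.6) shows that

$$\Big|\sum_{\ell_j \le 1} h(\ell_j) b(\ell_j)\phi_j(x)\phi_j(y)\Big| \le c, \qquad x, y \in \mathbb{X}. \tag{5.14}$$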



In view of (|2.19|) used with b in place of 6, and (15. 7|) used with d = 2 L = 2 fe , iJ = <?o 2 fc, we obtain 
$$\sup_{x\in\mathbb{X}} \int_{\mathbb{X}} |\Phi_{2^k}(g b_{2^k}; x, y)|\,d|\nu|(y) \le c\,2^{-k\beta}, \qquad k = 1, 2, \ldots, n.$$

Together with (5.13) and (5.14), this leads to (5.11). The proof of (5.12) is similar; we use (5.8) in place of (5.7). □

We observe that if $\mathcal{C}$ is a finite subset of $\mathbb{X}$, and $\nu$ is the measure that associates the mass $q(\mathcal{C})^\alpha$ with each $y \in \mathcal{C}$, then an eignet $\Psi(x) = \sum_{y\in\mathcal{C}} a_y G(x,y)$ can be expressed as $q(\mathcal{C})^{-\alpha} \int_{\mathbb{X}} a(y) G(x,y)\,d\nu(y)$, and $\sigma_L(\Psi, x) = q(\mathcal{C})^{-\alpha} \int_{\mathbb{X}} a(y)\,\Phi_L(h b_L; x, y)\,d\nu(y)$. One of the applications of the following proposition is then to estimate $\|\Psi - \sigma_L(\Psi)\|_p$. A different application is given in Lemma 6.1.
Proposition 5.2 Let $1 \le p \le \infty$, $\beta > \alpha/p'$, $b$ be a mask of type $\beta$, and (2.1), (2.4) hold.

(a) For every $y \in \mathbb{X}$, there exists $\psi_y := G(\circ, y) \in X^p$ such that $\langle \psi_y, \phi_k\rangle = b(\ell_k)\phi_k(y)$, $k = 0, 1, \ldots$. We have

$$\sup_{y\in\mathbb{X}} \|G(\circ, y)\|_p \le c. \tag{5.15}$$

(b) Let $n \ge 1$ be an integer, $\nu \in \mathcal{M}_{2^{-n}}$, and for $F \in L^1(\nu) \cap L^\infty(\nu)$, $m \ge n$,

$$U_m(F, x) := \int_{\mathbb{X}} \{G(x,y) - \Phi_{2^m}(h b_{2^m}; x, y)\} F(y)\,d\nu(y).$$

Then

$$\|U_m(F)\|_p \le c\,2^{-m\beta}\,2^{\alpha(m-n)/p'}\,\|\nu\|_{\mathcal{M}_{2^{-n}}}^{1/p'}\,\|F\|_{\nu;\mathbb{X},p}. \tag{5.16}$$



Proof. Since $\mu \in \mathcal{M}_d$ and $\|\mu\|_{\mathcal{M}_d} = 1$ for every $d > 0$, we conclude from (2.19) and (5.8) (used with $\mu$ in place of $\nu$, $1/L$ in place of $d$, $H = g b_L$), that

$$\sup_{y\in\mathbb{X}} \|\Phi_L(g b_L; y, \circ)\|_p \le c\,L^{\alpha/p' - \beta}, \qquad L \ge 1. \tag{5.17}$$

Since $\beta > \alpha/p'$, we conclude for integers $1 \le n \le N$,

$$\sup_{y\in\mathbb{X}} \Big\|\sum_{j=n+1}^N \Phi_{2^j}(g b_{2^j}; y, \circ)\Big\|_p \le \sum_{j=n+1}^N \sup_{y\in\mathbb{X}} \|\Phi_{2^j}(g b_{2^j}; y, \circ)\|_p \le c\,2^{n(\alpha/p' - \beta)}. \tag{5.18}$$

Thus, the sequence

$$\Phi_1(h b_1; y, \circ) + \sum_{j=1}^n \Phi_{2^j}(g b_{2^j}; y, \circ) = \Phi_{2^n}(h b_{2^n}; y, \circ)$$

converges in $L^p$ to some function in $X^p$, uniformly in $y$. Denoting this function by $\psi_y$, it is easy to calculate that $\langle \psi_y, \phi_k\rangle = b(\ell_k)\phi_k(y)$. Thus, the formal expansion of $\psi_y$ is the same as that of $G(\circ, y)$. Moreover,

$$\sigma_{2^n}(\psi_y, x) = \sigma_{2^n}(G(x, \circ), y) = \Phi_{2^n}(h b_{2^n}; y, x)$$

converges to $G(x,y)$ in the sense of $L^p$ in $x$, and uniformly in $y$. The estimate (5.15) is clear from (5.17) and (5.18).

To prove part (b), we use a similar argument again. Without loss of generality, we may assume that $\nu$ is a positive measure and $\|\nu\|_{\mathcal{M}_{2^{-n}}} = 1$. Let $j \ge n$ be an integer. Using (2.19), (5.7) with $2^{-n}$ for $d$, $2^j$ in place of $L$, and observing that $dL \ge 1$ with these choices, we obtain

$$\sup_{x\in\mathbb{X}} \int_{\mathbb{X}} |\Phi_{2^j}(g b_{2^j}; x, y)|\,d\nu(y) \le c\,2^{-n\alpha}\,2^{-j(\beta - \alpha)}. \tag{5.19}$$

Using (2.19), (5.7) with $\mu$ in place of $\nu$, $2^j$ in place of $L$, and $2^{-j}$ for $d$, we obtain

$$\sup_{x\in\mathbb{X}} \int_{\mathbb{X}} |\Phi_{2^j}(g b_{2^j}; x, y)|\,d\mu(y) \le c\,2^{-j\beta}.$$

Hence, Lemma 5.1 with $\nu$ in place of $\nu_1$, $\mu$ in place of $\nu_2$, implies that

$$\Big\|\int_{\mathbb{X}} \Phi_{2^j}(g b_{2^j}; \circ, y) F(y)\,d\nu(y)\Big\|_p \le c\,2^{-n\alpha/p'}\,2^{-j(\beta - \alpha/p')}\,\|F\|_{\nu;\mathbb{X},p}. \tag{5.20}$$

Since $\beta > \alpha/p'$, the sequence

$$\int_{\mathbb{X}} \Phi_{2^N}(h b_{2^N}; \circ, y) F(y)\,d\nu(y) = \int_{\mathbb{X}} \Phi_1(h b; \circ, y) F(y)\,d\nu(y) + \sum_{j=1}^N \int_{\mathbb{X}} \Phi_{2^j}(g b_{2^j}; \circ, y) F(y)\,d\nu(y)$$

converges in the sense of $L^p$ to some function in $X^p$. Since $\Phi_{2^N}(h b_{2^N}; \circ, y) \to G(\circ, y)$ in the sense of $L^p$ uniformly in $y$, this function must be $\int_{\mathbb{X}} G(\circ, y) F(y)\,d\nu(y)$. Consequently,

$$U_m(F, \circ) = \sum_{j=m+1}^\infty \int_{\mathbb{X}} \Phi_{2^j}(g b_{2^j}; \circ, y) F(y)\,d\nu(y)$$

in the sense of $L^p$, and (5.20) implies that

$$\|U_m(F)\|_p \le \sum_{j=m+1}^\infty \Big\|\int_{\mathbb{X}} \Phi_{2^j}(g b_{2^j}; \circ, y) F(y)\,d\nu(y)\Big\|_p \le c\,2^{-n\alpha/p'} \sum_{j=m+1}^\infty 2^{-j(\beta - \alpha/p')}\,\|F\|_{\nu;\mathbb{X},p} \le c\,2^{-m\beta}\,2^{\alpha(m-n)/p'}\,\|F\|_{\nu;\mathbb{X},p}.$$

□

We pause in our discussion to show that (3.14) implies a lower bound on the sum $\sum_{\ell_j \le L} \phi_j^2(x)$.



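Lemma 5.2 Let $C > 0$, $\{a_j\}$ be a sequence of nonnegative numbers such that $\sum_{j=0}^\infty \exp(-\ell_j^2 t)\,a_j$ converges for $t \in (0, 1]$. Then

$$c_1 L^C \le \sum_{\ell_j \le L} a_j \le c_2 L^C, \qquad L > 0, \tag{5.21}$$

if and only if

$$c_3 t^{-C/2} \le \sum_{j=0}^\infty \exp(-\ell_j^2 t)\,a_j \le c_4 t^{-C/2}, \qquad t \in (0, 1]. \tag{5.22}$$

In particular, (3.14) and (2.4) imply that

$$c_1 L^\alpha \le \sum_{\ell_j \le L} \phi_j^2(x) \le c_2 L^\alpha, \qquad x \in \mathbb{X},\ L \ge 1. \tag{5.23}$$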

Proof. The fact that the upper bound in (5.22) is equivalent to the upper bound in (5.21) is proved in [HI Proposition 4.1]. In this proof only, let $s(u) = \sum_{\ell_j \le u} a_j$. Then

$$\sum_{j=0}^\infty \exp(-\ell_j^2 t)\,a_j = \int_0^\infty e^{-u^2 t}\,ds(u).$$

Since the sum converges, it is not difficult to verify by integration by parts that

$$\sum_{j=0}^\infty \exp(-\ell_j^2 t)\,a_j = 2t \int_0^\infty u\,e^{-u^2 t} s(u)\,du. \tag{5.24}$$

If the lower bound in (5.21) holds, then $s(u) \ge c_1 u^C$ for $u > 0$, and

$$2t \int_0^\infty u\,e^{-u^2 t} s(u)\,du \ge 2 c_1 t \int_0^\infty u^{C+1} e^{-u^2 t}\,du = c_1 t^{-C/2} \int_0^\infty v^{C/2} e^{-v}\,dv = c\,t^{-C/2}.$$

Thus, the lower bound in (5.21) implies the lower bound in (5.22).

In the remainder of this proof, it is convenient to let the constants retain their value, which might be different from what they were in the above part of the proof. Let both the upper and lower inequalities in (5.22) hold. Then the upper bound in (5.21) holds also. We observe by integration by parts that for any $L > 0$, $L^2 t \ge C$,

$$\begin{aligned}
\int_L^\infty u^{C+1} e^{-u^2 t}\,du &= \frac{L^C}{2t} \exp(-L^2 t) + \frac{C}{2t} \int_L^\infty u^{C-1} e^{-u^2 t}\,du \\
&\le \frac{L^C}{2t} \exp(-L^2 t) + \frac{C}{2L^2 t} \int_L^\infty u^{C+1} e^{-u^2 t}\,du;
\end{aligned}$$

i.e.,

$$2t \int_L^\infty u^{C+1} e^{-u^2 t}\,du \le \Big(1 - \frac{C}{2L^2 t}\Big)^{-1} (L^2 t)^{C/2} \exp(-L^2 t)\,t^{-C/2}.$$

Thus, there exists $c_5$ such that

$$2t \int_L^\infty u^{C+1} e^{-u^2 t}\,du \le \frac{c_3}{2c_2}\,t^{-C/2}, \qquad L^2 t \ge c_5.$$

We conclude from the lower bound in (5.22), (5.24), and the upper bound in (5.21), that for $t, L > 0$, $L^2 t \ge c_5$,

$$\begin{aligned}
c_3 t^{-C/2} &\le 2t \int_0^\infty u\,e^{-u^2 t} s(u)\,du = 2t \int_0^L u\,e^{-u^2 t} s(u)\,du + 2t \int_L^\infty u\,e^{-u^2 t} s(u)\,du \\
&\le 2t\,s(L) \int_0^L u\,e^{-u^2 t}\,du + 2 c_2 t \int_L^\infty u^{C+1} e^{-u^2 t}\,du \\
&\le s(L)(1 - \exp(-L^2 t)) + c_3 t^{-C/2}/2.
\end{aligned}$$

Taking $t = c_5 L^{-2}$, we obtain from here that $s(L) \ge c_6 L^C$. □

In the remainder of this paper, we adopt the following notation. Let $k^* \ge \max(2, (1/\alpha)\log_2(2c_2/c_1))$ be a fixed integer, where $c_1$, $c_2$ are the constants in (5.23). Then for $x \in \mathbb{X}$,

$$\sum_{\ell_j \le 2^{-k^*} L} \phi_j^2(x) \le c_2\,2^{-\alpha k^*} L^\alpha \le (c_1/2) L^\alpha,$$

and hence, (3.14) implies that

$$\sum_{2^{-k^*} L < \ell_j \le L} \phi_j^2(x) \ge (c_1/2) L^\alpha. \tag{5.25}$$

We further introduce $\tilde g(t) := h(t) - h(2^{k^*+1} t)$. Then $\tilde g(t) \ge 0$ for all $t \in \mathbb{R}$, $\tilde g(t) = 0$ if $0 \le t \le 2^{-k^*-2}$ or $t \ge 1$, and $\tilde g(t) = 1$ if $2^{-k^*-1} \le t \le 1/2$. We note that

$$|||\tilde g\,b_L|||_S \le c\,L^{-\beta}, \qquad L \ge 1. \tag{5.26}$$

The following lemma will be needed in the proof of Theorem 3.4.



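Lemma 5.3 Suppose that (3.14) holds. Let $\mathcal{C} \subset \mathbb{X}$ be a finite set, $q = q(\mathcal{C}) \le 1$, and $\nu$ be a measure that associates the mass $q^\alpha$ with each $x \in \mathcal{C}$. Let (2.1), (2.3), and (2.4) hold. Then $\nu \in \mathcal{M}_q$, and $\|\nu\|_{\mathcal{M}_q} \le c$, the constant being independent of $q$. Next, we assume in addition that (3.14) holds. Then for every integer $m$ with $2^m \ge q^{-1}$,

$$\sum_{y\in\mathcal{C},\ y \ne x} |\Phi_{2^m}(\tilde g\,b_{2^m}; x, y)| \le c\,(q 2^m)^{-S+\alpha}\,2^{m(\alpha - \beta)}, \qquad x \in \mathcal{C}, \tag{5.27}$$

and

$$\Phi_{2^m}(\tilde g\,b_{2^m}; x, x) \ge c\,2^{m(\alpha - \beta)}, \qquad x \in \mathbb{X}. \tag{5.28}$$

In particular, there exists $c_1 > 0$ such that for $2^m q \ge c_1$,

$$\sum_{y\in\mathcal{C},\ y \ne x} |\Phi_{2^m}(\tilde g\,b_{2^m}; x, y)| \le (1/2)\,|\Phi_{2^m}(\tilde g\,b_{2^m}; x, x)|, \qquad x \in \mathcal{C}. \tag{5.29}$$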

Proof. If $x_0 \in \mathbb{X}$, $r > 0$ and $B(x_0, r) \cap \mathcal{C} = \{y_1, \ldots, y_J\}$, then the balls $B(y_j, q/2)$ are disjoint, and $\cup_{j=1}^J B(y_j, q/2) \subset B(x_0, r + q/2)$. Using the fact that $\nu(B(x_0, r)) = q^\alpha J$, and recalling (2.7), we obtain

$$\mu(B(x_0, r + q/2)) \ge \mu\big(\cup_{j=1}^J B(y_j, q/2)\big) = \sum_{j=1}^J \mu(B(y_j, q/2)) \ge c J q^\alpha = c\,\nu(B(x_0, r)).$$

In turn, (4.3) now implies that $\nu \in \mathcal{M}_q$, and $\|\nu\|_{\mathcal{M}_q} \le c$. Since every point $y \in \mathcal{C}$ with $y \ne x$ is in $\Delta(x, q)$, (5.26) and (5.6), used with $q$ in place of $r$ and $d$, $2^m$ in place of $L$, imply that

$$q^\alpha \sum_{y\in\mathcal{C},\ y \ne x} |\Phi_{2^m}(\tilde g\,b_{2^m}; x, y)| \le c\,(q 2^m)^{-S+2\alpha}\,2^{-m\beta} = c\,q^\alpha (q 2^m)^{-S+\alpha}\,2^{m(\alpha - \beta)}.$$

This proves (5.27).

We recall that $\tilde g(t) = 1$ if $2^{-k^*-1} \le t \le 1/2$ and $b(\ell_j) > 0$ for $\ell_j \ge c$. Consequently, (5.25) implies that for any $m \ge c$, and $x \in \mathbb{X}$,

$$\Phi_{2^m}(\tilde g\,b_{2^m}; x, x) = \sum_{2^{m-k^*-2} \le \ell_j \le 2^m} \tilde g(\ell_j/2^m)\,b(\ell_j)\,\phi_j^2(x) \ge c\,2^{-m\beta} \sum_{2^{m-1-k^*} \le \ell_j \le 2^{m-1}} \phi_j^2(x) \ge c\,2^{m(\alpha - \beta)}.$$

This proves (5.28). Recalling that $S > \alpha$, we may choose $m$ to make $2^m q$ large enough, yet $\sim 1$, so that (5.27) and (5.28) lead to (5.29). □



5.2 Diffusion polynomials 

In this section, we summarize various properties of the diffusion polynomials, and approximation by these. The first statement is only a simple corollary of Theorem 5.1.

Corollary 5.2 Let $1 \le p \le \infty$, $d > 0$, $H$, and the other conditions be as in Theorem 5.1, and $\nu \in \mathcal{M}_d$. Then for any $L > 0$ and $P \in \Pi_L$,

$$\|\sigma_L(H; \mu; f)\|_{\nu;\mathbb{X},p} \le c\,(1 + (dL)^\alpha)^{1/p}\,\|\nu\|_{\mathcal{M}_d}^{1/p}\,|||H|||_S\,\|f\|_p, \tag{5.30}$$

$$\|\sigma_L(H; \nu; f)\|_p \le c\,(1 + (dL)^\alpha)^{1/p'}\,\|\nu\|_{\mathcal{M}_d}^{1/p'}\,|||H|||_S\,\|f\|_{\nu;\mathbb{X},p}. \tag{5.31}$$

In particular, if $P \in \Pi_L$, then

$$\|P\|_{\nu;\mathbb{X},p} \le c\,(1 + (dL)^\alpha)^{1/p}\,\|\nu\|_{\mathcal{M}_d}^{1/p}\,\|P\|_p. \tag{5.32}$$

Proof. The estimates (5.30) and (5.31) follow from Lemma 5.1, (5.3), and (5.7). Let $P \in \Pi_L$. Then $\sigma_{2L}(h; \mu; P) = P$. We use (5.30) with $2L$ in place of $L$, $h$ in place of $H$, and $P$ in place of $f$ to deduce (5.32). □

The next lemma states some estimates for different pseudo-derivatives of diffusion polynomials. 

Lemma 5.4 Let $\beta > \gamma > 0$, $L > 0$, $P \in \Pi_L$, and (2.1), (2.4) hold.

(a) For any $r > 0$,

$$\|(\Delta^*)^r P\|_p \le c\,L^r\,\|P\|_p. \tag{5.33}$$

(b) If $G$ is a kernel of type $\beta$, and $\mathcal{D}_G$ is the operator defined in Definition 3.1, then

$$\|\mathcal{D}_G P\|_p \le c\,L^{\beta - \gamma}\,\|(\Delta^*)^\gamma P\|_p. \tag{5.34}$$

Proof. Part (a) is proved in [25]. We will prove part (b). In this proof only, let $n \ge 1$ be an integer such that $L \le 2^{n-1}$. In this proof only, let $b_\gamma(t) = (1 + |t|)^\gamma b(t)$, $t \in \mathbb{R}$. Then $b_\gamma^{-1}$ is a mask of type $\gamma - \beta < 0$. For $x \in \mathbb{X}$, we have

$$\mathcal{D}_G P(x) = \sum_j h(\ell_j/2^n)\,b_\gamma(\ell_j)^{-1}\,\langle (\Delta^*)^\gamma P, \phi_j\rangle\,\phi_j(x) = \int_{\mathbb{X}} \Phi_{2^n}(h\,b_{\gamma,2^n}^{-1}; x, y)\,(\Delta^*)^\gamma P(y)\,d\mu(y). \tag{5.35}$$

We deduce (5.34) using (5.11) with $b_\gamma^{-1}$ and $\gamma - \beta < 0$ in place of $b$ and $\beta$, and Lemma 5.1 with $\nu_1 = \nu_2 = \mu$. □



Even though a product of two diffusion polynomials is not necessarily a diffusion polynomial, the 
"product assumption" allows us to estimate the error in discretizing an integral of the product of such 
polynomials using a quadrature measure. This is summarized in the next lemma. 



Lemma 5.5 Let $L > 0$, and (2.1), (2.4) hold. For any $p, r$, $1 \le p < r \le \infty$ and $P \in \Pi_L$,

$$\|P\|_r \le c\,L^{\alpha(1/p - 1/r)}\,\|P\|_p. \tag{5.36}$$

We assume further that the product assumption holds. If $\nu$ is a quadrature measure of order $AL$, $|\nu|(\mathbb{X}) \le c$, and $P_1, P_2 \in \Pi_L$, then for any $p, r$, $1 \le p, r \le \infty$ and any positive number $R > 0$,

$$\Big|\int_{\mathbb{X}} P_1 P_2\,d\mu - \int_{\mathbb{X}} P_1 P_2\,d\nu\Big| \le c_1 L^{2\alpha} \epsilon_L\,\|P_1\|_p \|P_2\|_r \le c(R)\,L^{-R}\,\|P_1\|_p \|P_2\|_r. \tag{5.37}$$



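Proof. Since

$$P(x) = \int_{\mathbb{X}} P(y)\,\Phi_{2L}(x,y)\,d\mu(y),$$

(5.2) implies that $\|P\|_\infty \le c L^\alpha \|P\|_1$. Therefore, the convexity inequality (cf. (5.10)) implies that $\|P\|_\infty \le c L^{\alpha/p} \|P\|_p$. If $r < \infty$, then

$$\|P\|_r^r = \int_{\mathbb{X}} |P(x)|^r\,d\mu(x) \le \|P\|_\infty^{r-p}\,\|P\|_p^p \le c\,L^{\alpha(r/p - 1)}\,\|P\|_p^r.$$

This proves (5.36).

Next, we assume that the product assumption holds. Let $P_1 = \sum_{\ell_j \le L} a_j \phi_j$, $P_2 = \sum_{\ell_k \le L} d_k \phi_k$, and $Q_{j,k} \in \Pi_{AL}$ be found so that $\|\phi_j \phi_k - Q_{j,k}\|_\infty \le 2\operatorname{dist}(\infty; \phi_j \phi_k, \Pi_{AL}) \le 2\epsilon_L$. Then, with $Q := \sum_{j,k} a_j d_k Q_{j,k}$, we have for every $x \in \mathbb{X}$,

$$|P_1(x) P_2(x) - Q(x)| = \Big|\sum_{j,k} a_j d_k \big(\phi_j(x)\phi_k(x) - Q_{j,k}(x)\big)\Big| \le 2\epsilon_L \sum_{j,k} |a_j|\,|d_k|. \tag{5.38}$$

In view of (2.6),

$$|\{j : \ell_j \le L\}| = \sum_{\ell_j \le L} \int_{\mathbb{X}} \phi_j^2(x)\,d\mu(x) \le c\,L^\alpha.$$

Therefore, we conclude using (5.36) and (5.38) that

$$\|P_1 P_2 - Q\|_\infty \le 2\epsilon_L \sum_{j,k} |a_j|\,|d_k| \le c\,L^\alpha \epsilon_L\,\|\mathbf{a}\|_{\ell^2} \|\mathbf{d}\|_{\ell^2} = c\,L^\alpha \epsilon_L\,\|P_1\|_2 \|P_2\|_2 \le c\,L^{2\alpha} \epsilon_L\,\|P_1\|_p \|P_2\|_r.$$

Recalling that $|\nu|(\mathbb{X}) \le c$, and $\int_{\mathbb{X}} Q\,d\mu = \int_{\mathbb{X}} Q\,d\nu$, we deduce that

$$\begin{aligned}
\Big|\int_{\mathbb{X}} P_1(x) P_2(x)\,d\mu(x) - \int_{\mathbb{X}} P_1(x) P_2(x)\,d\nu(x)\Big| &= \Big|\int_{\mathbb{X}} (P_1(x) P_2(x) - Q(x))\,d\mu(x) - \int_{\mathbb{X}} (P_1(x) P_2(x) - Q(x))\,d\nu(x)\Big| \\
&\le c\,\|P_1 P_2 - Q\|_\infty \le c\,L^{2\alpha} \epsilon_L\,\|P_1\|_p \|P_2\|_r.
\end{aligned}$$

The product assumption implies that $L^{2\alpha + R} \epsilon_L \le c$, leading thereby to (5.37). □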



Next, we prove a result regarding approximation by diffusion polynomials. Part (a) of this result is 
essentially proved in [25] ; we prove it again for the sake of completeness. 
Proposition 5.3 Let $1 \le p \le \infty$, $f \in X^p$, $L > 0$, $r > 0$, and (2.1), (2.4) hold.

(a) We have

$$\|f - \sigma_L(f)\|_p + L^{-r}\|(\Delta^*)^r \sigma_L(f)\|_p \le c\,\omega_r(p; f, 1/L). \tag{5.39}$$

In particular, if $f \in W_r^p$, then

$$\operatorname{dist}(p; f, \Pi_L) \le \|f - \sigma_L(f)\|_p \le c\,L^{-r}\|(\Delta^*)^r f\|_p. \tag{5.40}$$

(b) If $f \in W_r^p$, and $P \in \Pi_L$ satisfies $\|f - P\|_p \le \epsilon$, then

$$\|(\Delta^*)^r f - (\Delta^*)^r P\|_p \le c\{L^r \epsilon + \operatorname{dist}(p; (\Delta^*)^r f, \Pi_{L/2})\}. \tag{5.41}$$

In particular, $\|(\Delta^*)^r P\|_p \le c(L^r \epsilon + \|(\Delta^*)^r f\|_p)$.

(c) We assume in addition that the product assumption holds. Let $\nu$ be a $1/L$-regular quadrature measure of order $AL$. For any $f \in W_r^\infty$,

$$\|f - \sigma_L(\nu; f)\|_\infty \le c\,L^{-r}\{\|f\|_\infty + \|(\Delta^*)^r f\|_\infty\}. \tag{5.42}$$

If $f \in X^\infty$, then

$$\|f - \sigma_L(\nu; f)\|_\infty + L^{-r}\|(\Delta^*)^r \sigma_L(\nu; f)\|_\infty \le c\{\omega_r(\infty; f, L^{-1}) + L^{-r}\|f\|_\infty\}. \tag{5.43}$$

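Proof. First, we prove (5.40). This proof is the same as that of [25, (6.4)]. Thus, let $J$ be the greatest integer with $2^J \le L$. In this proof only, let $g_j(t) = g(t)/(2^{-j} + |t|)^r$, $t \in \mathbb{R}$. Recalling that $g$ is supported on $[1/4, 1] \cup [-1, -1/4]$, we see that $|||g_j|||_S \le c$. Hence, (5.4) implies that

$$\|\sigma_{2^j}(g; f)\|_p = 2^{-jr}\|\sigma_{2^j}(g_j; (\Delta^*)^r f)\|_p \le c\,2^{-jr}\|(\Delta^*)^r f\|_p.$$

Hence,

$$\operatorname{dist}(p; f, \Pi_L) \le \operatorname{dist}(p; f, \Pi_{2^J}) \le \|f - \sigma_{2^J}(f)\|_p \le \sum_{j=J+1}^\infty \|\sigma_{2^j}(g; f)\|_p \le c\,2^{-Jr}\|(\Delta^*)^r f\|_p \le c\,L^{-r}\|(\Delta^*)^r f\|_p.$$

If $P \in \Pi_{L/2}$ is chosen so that $\|f - P\|_p \le 2\operatorname{dist}(p; f, \Pi_{L/2})$, then (5.4) implies that

$$\|f - \sigma_L(f)\|_p = \|f - P - \sigma_L(f - P)\|_p \le c\|f - P\|_p \le c\operatorname{dist}(p; f, \Pi_{L/2}) \le c\,L^{-r}\|(\Delta^*)^r f\|_p.$$

This proves (5.40). In particular, we note that if $Q \in \Pi_{L/2}$ is chosen so that $\|(\Delta^*)^r (f - Q)\|_p \le 2\operatorname{dist}(p; (\Delta^*)^r f, \Pi_{L/2})$, then

$$\|f - \sigma_L(f)\|_p = \|f - Q - \sigma_L(f - Q)\|_p \le c\,L^{-r}\|(\Delta^*)^r (f - Q)\|_p \le c\,L^{-r}\operatorname{dist}(p; (\Delta^*)^r f, \Pi_{L/2}). \tag{5.44}$$

Next, let $f_1$ be chosen so that $\|f - f_1\|_p + L^{-r}\|(\Delta^*)^r f_1\|_p \le 2\omega_r(p; f, 1/L)$. Then using (5.40) and (5.33), we deduce that

$$\begin{aligned}
\|f - \sigma_L(f)\|_p &+ L^{-r}\|(\Delta^*)^r \sigma_L(f)\|_p \\
&\le \|f - f_1 - \sigma_L(f - f_1)\|_p + \|f_1 - \sigma_L(f_1)\|_p + L^{-r}\{\|(\Delta^*)^r \sigma_L(f - f_1)\|_p + \|(\Delta^*)^r \sigma_L(f_1)\|_p\} \\
&\le c\{\|f - f_1\|_p + L^{-r}\|(\Delta^*)^r f_1\|_p + \|\sigma_L(f - f_1)\|_p + L^{-r}\|\sigma_L((\Delta^*)^r f_1)\|_p\} \\
&\le c\{\|f - f_1\|_p + L^{-r}\|(\Delta^*)^r f_1\|_p\} \le c\,\omega_r(p; f, 1/L).
\end{aligned}$$

This proves (5.39).

Next, we prove part (b). In view of (5.33), (5.4), and (5.44),

$$\begin{aligned}
\|(\Delta^*)^r P - (\Delta^*)^r f\|_p &\le \|(\Delta^*)^r (P - \sigma_L(f))\|_p + \|(\Delta^*)^r (f - \sigma_L(f))\|_p \\
&= \|(\Delta^*)^r (P - \sigma_L(f))\|_p + \|(\Delta^*)^r f - \sigma_L((\Delta^*)^r f)\|_p \\
&\le c\,L^r \|P - \sigma_L(f)\|_p + c_1 \operatorname{dist}(p; (\Delta^*)^r f, \Pi_{L/2}) \\
&\le c\,L^r \|P - f\|_p + c\,L^r \|f - \sigma_L(f)\|_p + c_1 \operatorname{dist}(p; (\Delta^*)^r f, \Pi_{L/2}) \\
&\le c\,L^r \epsilon + c_1 \operatorname{dist}(p; (\Delta^*)^r f, \Pi_{L/2}).
\end{aligned}$$

This proves part (b).

To prove part (c), let $P \in \Pi_{L/2}$ be arbitrary. Since

$$P(x) = \int_{\mathbb{X}} P(y)\,\Phi_L(x,y)\,d\mu(y), \qquad x \in \mathbb{X},$$

we obtain from (5.37) (with $r$ in place of $R$) and (5.3) that for every $x \in \mathbb{X}$,

$$|P(x) - \sigma_L(\nu; P, x)| = \Big|\int_{\mathbb{X}} P(y)\,\Phi_L(x,y)\,d\mu(y) - \int_{\mathbb{X}} P(y)\,\Phi_L(x,y)\,d\nu(y)\Big| \le c_1 L^{-r}\|P\|_\infty \|\Phi_L(x, \circ)\|_1 \le c\,L^{-r}\|P\|_\infty. \tag{5.45}$$

Hence, if $f \in W_r^\infty$,

$$\begin{aligned}
\|f - \sigma_L(\nu; f)\|_\infty &\le \|f - \sigma_{L/2}(f)\|_\infty + \|\sigma_L(\nu; f - \sigma_{L/2}(f))\|_\infty + \|\sigma_{L/2}(f) - \sigma_L(\nu; \sigma_{L/2}(f))\|_\infty \\
&\le c\{\|f - \sigma_{L/2}(f)\|_\infty + L^{-r}\|\sigma_{L/2}(f)\|_\infty\} \\
&\le c_1 L^{-r}\{\|(\Delta^*)^r f\|_\infty + \|f\|_\infty\}.
\end{aligned} \tag{5.46}$$

This proves (5.42). Next, let $f \in X^\infty$, and

$$\|f - f_1\|_\infty + L^{-r}\|(\Delta^*)^r f_1\|_\infty \le 2\omega_r(\infty; f, 1/L).$$

Then using (5.31) and (5.46) (with $f_1$ in place of $f$), we obtain

$$\begin{aligned}
\|f - \sigma_L(\nu; f)\|_\infty &\le \|f - f_1\|_\infty + \|\sigma_L(\nu; f - f_1)\|_\infty + \|f_1 - \sigma_L(\nu; f_1)\|_\infty \\
&\le c\{\|f - f_1\|_\infty + L^{-r}\|(\Delta^*)^r f_1\|_\infty + L^{-r}\|f_1\|_\infty\} \\
&\le c\{\omega_r(\infty; f, L^{-1}) + L^{-r}\|f\|_\infty\}.
\end{aligned} \tag{5.47}$$

Applying (5.46) with $f_1$ in place of $f$, and using part (b) of this proposition, we see that

$$\begin{aligned}
\|(\Delta^*)^r f_1 - (\Delta^*)^r \sigma_L(\nu; f_1)\|_\infty &\le c\{\|(\Delta^*)^r f_1\|_\infty + \|f_1\|_\infty + \|(\Delta^*)^r f_1\|_\infty\} \\
&\le c\{\|f - f_1\|_\infty + \|(\Delta^*)^r f_1\|_\infty + \|f\|_\infty\}.
\end{aligned}$$

Hence, using (5.33) and the uniform boundedness of the operators $\sigma_L(\nu)$, we obtain

$$\begin{aligned}
\|(\Delta^*)^r \sigma_L(\nu; f)\|_\infty &\le \|(\Delta^*)^r \sigma_L(\nu; f - f_1)\|_\infty + \|(\Delta^*)^r f_1 - (\Delta^*)^r \sigma_L(\nu; f_1)\|_\infty + \|(\Delta^*)^r f_1\|_\infty \\
&\le c\{L^r \|\sigma_L(\nu; f - f_1)\|_\infty + \|f - f_1\|_\infty + \|(\Delta^*)^r f_1\|_\infty + \|f\|_\infty\} \\
&\le c\{L^r \|f - f_1\|_\infty + \|(\Delta^*)^r f_1\|_\infty + \|f\|_\infty\} \\
&\le c\,L^r \{\omega_r(\infty; f, 1/L) + L^{-r}\|f\|_\infty\}.
\end{aligned}$$

The estimate (5.43) follows from this estimate and (5.47). □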



6 Proofs of the main results 

In this section, we assume all the assumptions made in Section 2.1, namely, that (2.1), (2.3), (2.4), (2.5), and the product assumption hold. We start with the proof of Theorem 3.1. Let $W^* = \{w_y^*\}_{y\in\mathcal{C}^*}$, and $\nu^*$ be the measure that associates with each $y \in \mathcal{C}^*$ the mass $w_y^*$. As explained in Section 4, the eignet $\mathbb{G}(\mathcal{C}^*; W^*; P)$ can be written more concisely as

$$\mathbb{G}(\mathcal{C}^*; W^*; P, x) =: \mathbb{G}(\nu^*; P, x) := \mathbb{G}(G; \nu^*; P, x) = \int_{\mathbb{X}} (\mathcal{D}_G P)(y)\,G(x,y)\,d\nu^*(y), \qquad x \in \mathbb{X}.$$

The condition that $W^*$ is a $1/L$-regular set of quadrature weights of order $2AL$ corresponding to $\mathcal{C}^*$ can be stated more concisely in the form that $\nu^* \in \mathcal{M}_{1/L}$, $\|\nu^*\|_{\mathcal{M}_{1/L}} \le c$, and $\nu^*$ is a quadrature measure of order $2AL$. Theorem 3.1 then takes the form of the following Theorem 6.1.

Theorem 6.1 Let $L > 0$, $\nu^* \in \mathcal{M}_{1/L}$, $\|\nu^*\|_{\mathcal{M}_{1/L}} \le c$, and $\nu^*$ be a quadrature measure of order $2AL$. Let $1 \le p \le \infty$, $\beta > \alpha/p'$, $0 < r < \beta$, $f \in W_r^p$. Let $P \in \Pi_L$ satisfy $\|f - P\|_p \le c\,L^{-r}\|(\Delta^*)^r f\|_p$. Then

$$\|f - \mathbb{G}(\nu^*; P)\|_p \le c\,L^{-r}\|(\Delta^*)^r f\|_p. \tag{6.1}$$



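The following lemma summarizes some of the major details of the proof of this theorem, so as to be applicable in the proof of some of the other results in Section 3.

Lemma 6.1 Let $n \ge 1$ be an integer, $\nu \in \mathcal{M}_{2^{-n}}$, $\|\nu\|_{\mathcal{M}_{2^{-n}}} \le c$. Let $1 \le p \le \infty$, $\beta > \alpha/p'$, $0 < r < \beta$, $P \in \Pi_{2^n}$. We have

$$\Big\|\int_{\mathbb{X}} \{G(\circ, y) - \Phi_{2^n}(h b_{2^n}; \circ, y)\}\,\mathcal{D}_G P(y)\,d\nu(y)\Big\|_p \le c\,2^{-nr}\|(\Delta^*)^r P\|_p \le c\|P\|_p. \tag{6.2}$$

In addition, if $\nu$ is a quadrature measure of order $A 2^n$, and $R > 0$, then

$$\Big\|\int_{\mathbb{X}} \Phi_{2^n}(h b_{2^n}; \circ, y)\,\mathcal{D}_G P(y)\,d\nu(y) - \int_{\mathbb{X}} \Phi_{2^n}(h b_{2^n}; \circ, y)\,\mathcal{D}_G P(y)\,d\mu(y)\Big\|_p \le c(R)\,2^{-n(R+r)}\|(\Delta^*)^r P\|_p \le c(R)\,2^{-nR}\|P\|_p, \tag{6.3}$$

and

$$\|P - \mathbb{G}(\nu; P)\|_p \le c\,2^{-nr}\|(\Delta^*)^r P\|_p \le c\|P\|_p. \tag{6.4}$$

If $0 < \gamma < \beta - \alpha/p'$, and $\gamma < r < \beta$, then

$$\|(\Delta^*)^\gamma P - (\Delta^*)^\gamma \mathbb{G}(\nu; P)\|_p \le c\,2^{-n(r - \gamma)}\|(\Delta^*)^r P\|_p. \tag{6.5}$$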

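Proof. Since $\mathcal{D}_G P \in \Pi_{2^n}$, we conclude using (5.32), (5.34), and (5.33) with $2^{-n}$ in place of $d$, $2^n$ in place of $L$ and $r$ in place of $\gamma$ that

$$\|\mathcal{D}_G P\|_{\nu;\mathbb{X},p} \le c\|\mathcal{D}_G P\|_p \le c\,2^{n(\beta - r)}\|(\Delta^*)^r P\|_p \le c\,2^{n\beta}\|P\|_p.$$

The estimate (6.2) follows from this and Proposition 5.2(b), used with $m = n$, $\mathcal{D}_G P$ in place of $F$. Next, for each $x \in \mathbb{X}$, (5.37) (with $R + \beta$ in place of $R$) and the last estimate in (5.11) imply that

$$\begin{aligned}
\Big|\int_{\mathbb{X}} \Phi_{2^n}(h b_{2^n}; x, y)\,\mathcal{D}_G P(y)\,d\nu(y) &- \int_{\mathbb{X}} \Phi_{2^n}(h b_{2^n}; x, y)\,\mathcal{D}_G P(y)\,d\mu(y)\Big| \\
&\le c(R)\,2^{-n(R+\beta)}\|\Phi_{2^n}(h b_{2^n}; x, \circ)\|_1 \|\mathcal{D}_G P\|_p \le c_1(R)\,2^{-n(R+r)}\|(\Delta^*)^r P\|_p.
\end{aligned}$$

This proves the first inequality in (6.3); the second follows from (5.33).

In this proof only, we write $\tilde\nu = \mu - \nu$, and observe that $\|\tilde\nu\|_{\mathcal{M}_{2^{-n}}} \le c$. In view of (3.1), we obtain

$$\begin{aligned}
P(x) - \mathbb{G}(\nu; P, x) &= \int_{\mathbb{X}} G(x,y)\,\mathcal{D}_G P(y)\,d\tilde\nu(y) \\
&= \int_{\mathbb{X}} \{G(x,y) - \Phi_{2^n}(h b_{2^n}; x, y)\}\,\mathcal{D}_G P(y)\,d\tilde\nu(y) + \int_{\mathbb{X}} \Phi_{2^n}(h b_{2^n}; x, y)\,\mathcal{D}_G P(y)\,d\tilde\nu(y).
\end{aligned} \tag{6.6}$$

Using the first estimate in (6.2) with $\tilde\nu$ in place of $\nu$, we obtain

$$\Big\|\int_{\mathbb{X}} \{G(\circ, y) - \Phi_{2^n}(h b_{2^n}; \circ, y)\}\,\mathcal{D}_G P(y)\,d\tilde\nu(y)\Big\|_p \le c\,2^{-nr}\|(\Delta^*)^r P\|_p. \tag{6.7}$$

Together with (6.3), (6.6), this implies (6.4).

In the remainder of this proof only, let $G_\gamma(x,y)$ be defined formally by $G_\gamma(x,y) = \sum_{j=0}^\infty (1 + \ell_j)^\gamma b(\ell_j)\,\phi_j(x)\phi_j(y)$. Then $G_\gamma$ is clearly a kernel of type $\beta - \gamma > \alpha/p'$. Let $P \in \Pi_\infty$. For $y \in \mathbb{X}$, we have

$$\mathcal{D}_G P(y) = \sum_{j=0}^\infty \frac{\langle P, \phi_j\rangle}{b(\ell_j)}\,\phi_j(y) = \sum_{j=0}^\infty \frac{\langle P, \phi_j\rangle (1 + \ell_j)^\gamma}{(1 + \ell_j)^\gamma b(\ell_j)}\,\phi_j(y) = \mathcal{D}_{G_\gamma}((\Delta^*)^\gamma P)(y).$$

Consequently, we obtain for $x \in \mathbb{X}$,

$$(\Delta^*)^\gamma \mathbb{G}(G; \nu; P, x) = \int_{\mathbb{X}} G_\gamma(x,y)\,\mathcal{D}_G P(y)\,d\nu(y) = \int_{\mathbb{X}} G_\gamma(x,y)\,\mathcal{D}_{G_\gamma}((\Delta^*)^\gamma P)(y)\,d\nu(y) = \mathbb{G}(G_\gamma; \nu; (\Delta^*)^\gamma P, x).$$

The estimate (6.5) now follows easily from (6.4), used with $(\Delta^*)^\gamma P$ in place of $P$, $r - \gamma$ in place of $r$. □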

Proof of Theorem 16.11 (and hence, Theorem 13. 1J) . In this proof only, let n be the greatest integer 
such that 2™ < L. Then v* is also a 2~ n - regular quadrature measure of order 2A2 n , and ||^*||x _„ < c. 
In view of Proposition ES^b), ||(A*) r P|| p < c|| (A*) r /|| P - Our choice of P and (0 now imply tfHTTjl . □ 

Proof of Theorem 3.2. We note that in our current notation, $\mathbb{G}_L(f) = \mathbb{G}(\nu^*;\sigma_L(f))$. We let $n$ be as in the proof of Theorem 6.1. Hence, using (6.4) and Proposition 5.3(a), we obtain

$$\|f - \mathbb{G}(\nu^*;\sigma_L(f))\|_p \le \|f - \sigma_L(f)\|_p + \|\sigma_L(f) - \mathbb{G}(\nu^*;\sigma_L(f))\|_p \le \|f - \sigma_L(f)\|_p + cL^{-r}\|(\Delta^*)^r\sigma_L(f)\|_p \le c\,\omega_r(p;f,1/L). \tag{6.8}$$

Since it is obvious that $\omega_r(p;f,1/L) \le \|f\|_p$ (by choosing $f_1 = 0$ in the definition of $\omega_r$), this implies also that $\|\mathbb{G}(\nu^*;\sigma_L(f))\|_p \le c\|f\|_p$.

Using (6.5) with $r = \gamma$ and $\sigma_L(f)$ in place of $P$, we obtain

$$\|(\Delta^*)^r\mathbb{G}(\nu^*;\sigma_L(f)) - (\Delta^*)^r\sigma_L(f)\|_p \le c\|(\Delta^*)^r\sigma_L(f)\|_p.$$

Hence, using Proposition 5.3(a) again,

$$\|(\Delta^*)^r\mathbb{G}(\nu^*;\sigma_L(f))\|_p \le c\|(\Delta^*)^r\sigma_L(f)\|_p \le cL^r\omega_r(p;f,1/L).$$

Together with (6.8), this implies (3.10).

Next, we turn to part (b). In this part of the proof, let $\nu$ be the measure that associates the mass $w_y$ with each $y \in C$, so that $|||\nu|||_{R,1/L} \le c$. Then in our current notation,

$$\mathbb{G}_L(C;W;f) = \mathbb{G}(C^*;W^*;\sigma_L(C;W;f)) = \mathbb{G}(\nu^*;\sigma_L(\nu;f)).$$

Using ([631), flOT]) with d = l/L, H = h, we obtain 

$$\|\mathbb{G}(\nu^*;\sigma_L(\nu;f))\|_p \le c\|\sigma_L(\nu;f)\|_p \le c\|f\|_{\nu;X,p}.$$

This proves (3.11). The proof of (3.12) is the same as that of (3.10), except that we have to use Proposition 5.3(c) instead, and the estimates are accordingly as claimed. □

During the rest of this section, we assume that (3.14) (and hence, by Lemma 5.2, (5.23)) holds. Next, we prove Theorem 3.4. This will be done using Lemma 5.3 and the following general statement about the inverse of matrices. Proposition 6.1 is most probably not new, but we find it easier to prove it than to find a reference for it.

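Proposition 6.1 Let $M \ge 1$ be an integer, $A$ be an $M \times M$ matrix whose $(i,j)$-th entry is $A_{i,j}$, $1 \le p \le \infty$, and $\gamma \in [0,1)$. If

$$\sum_{i=1,\ i\ne j}^{M} |A_{j,i}| \le \gamma|A_{j,j}|, \qquad \sum_{i=1,\ i\ne j}^{M} |A_{i,j}| \le \gamma|A_{j,j}|, \qquad j = 1,\ldots,M, \tag{6.9}$$

and $\lambda = \min_{1\le i\le M}|A_{i,i}| > 0$, then $A$ is invertible, and

$$\|A^{-1}y\|_{\ell^p} \le ((1-\gamma)\lambda)^{-1}\|y\|_{\ell^p}, \qquad y \in \mathbb{R}^M. \tag{6.10}$$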
Proof. Let $a = (a_1,\ldots,a_M) \in \mathbb{R}^M$, and $y = Aa$. First, we consider the case $p = \infty$. Let $j^*$ be the index such that $|a_{j^*}| = \|a\|_{\ell^\infty}$. Then, in view of the first estimate in (6.9), we have

$$\|y\|_{\ell^\infty} \ge |y_{j^*}| = \left|\sum_{i=1}^{M} A_{j^*,i}a_i\right| \ge |A_{j^*,j^*}||a_{j^*}| - \sum_{i\ne j^*}|A_{j^*,i}||a_i| \ge |A_{j^*,j^*}|(1-\gamma)\|a\|_{\ell^\infty} \ge (1-\gamma)\lambda\|a\|_{\ell^\infty}.$$

In particular, $Aa = 0$ forces $a = 0$. Therefore, $A$ is invertible. For every $y$, there exists $a = A^{-1}y$. Applying the above chain of inequalities with this $a$, we have proved (6.10) in the case $p = \infty$.
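Next, using the second estimate in (6.9), we obtain

$$\|y\|_{\ell^1} = \sum_{j=1}^{M}|y_j| = \sum_{j=1}^{M}\left|\sum_{i=1}^{M} A_{j,i}a_i\right| \ge \sum_{j=1}^{M}|A_{j,j}||a_j| - \sum_{j=1}^{M}\sum_{i\ne j}|A_{j,i}||a_i| = \sum_{j=1}^{M}|A_{j,j}||a_j| - \sum_{i=1}^{M}|a_i|\sum_{j\ne i}|A_{j,i}| \ge \sum_{j=1}^{M}|A_{j,j}|(1-\gamma)|a_j| \ge \lambda(1-\gamma)\|a\|_{\ell^1}.$$

This proves (6.10) in the case $p = 1$.

The intermediate cases, $1 < p < \infty$, of (6.10) follow from the Riesz-Thorin interpolation theorem. □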
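
The bound (6.10) can be checked numerically on a small example. The following is only a minimal sketch, not part of the argument: the helper names `solve`, `lp_norm`, `dominance_constants`, and `random_dominant_matrix` are hypothetical, the test matrix is an arbitrary diagonally dominant matrix chosen for illustration, and a small tolerance absorbs rounding error.

```python
import random

def solve(A, y):
    """Solve A a = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    a = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * a[c] for c in range(i + 1, n))
        a[i] = s / M[i][i]
    return a

def lp_norm(v, p):
    """The l^p norm of a vector, with p = inf allowed."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def dominance_constants(A):
    """Smallest gamma for which both row and column conditions hold, and lambda = min |A_jj|."""
    n = len(A)
    gamma = 0.0
    for j in range(n):
        row = sum(abs(A[j][i]) for i in range(n) if i != j)
        col = sum(abs(A[i][j]) for i in range(n) if i != j)
        gamma = max(gamma, row / abs(A[j][j]), col / abs(A[j][j]))
    lam = min(abs(A[j][j]) for j in range(n))
    return gamma, lam

def random_dominant_matrix(n, rng):
    """Illustrative test matrix: off-diagonal entries of size at most 1/(4n), diagonal in [1, 2]."""
    A = [[rng.uniform(-1, 1) / (4 * n) for _ in range(n)] for _ in range(n)]
    for j in range(n):
        A[j][j] = rng.uniform(1.0, 2.0)
    return A

rng = random.Random(0)
A = random_dominant_matrix(6, rng)
gamma, lam = dominance_constants(A)
y = [rng.uniform(-1, 1) for _ in range(6)]
a = solve(A, y)  # a = A^{-1} y
bound_ok = all(
    lp_norm(a, p) <= lp_norm(y, p) / ((1 - gamma) * lam) + 1e-12
    for p in (1, 1.5, 2, 4, float("inf"))
)
```

For the symmetric matrix with diagonal entries 2 and off-diagonal entries 0.5, one has $\gamma = 1/4$, $\lambda = 2$, and the vector $y = (1,-1)$ attains equality in the bound for every $p$, which suggests that the constant $((1-\gamma)\lambda)^{-1}$ cannot be improved in general.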

PROOF of Theorem 3.4. In this proof only, let $\Psi = \sum_{y\in C} a_y G(\circ,y)$, and $m$ be chosen so that $2^m \ge c_1q^{-1}$ and (5.29) holds. Then, with $g$ as defined just before Lemma 5.3,

$$\Phi_{2^m}(g;\Psi,x) = \sum_{j} g(\ell_j/2^m)\sum_{y\in C} a_y b(\ell_j)\phi_j(y)\phi_j(x) = \sum_{y\in C} a_y\Phi_{2^m}(gb_{2^m};x,y).$$

In this proof only, let $d$ denote the vector $(\Phi_{2^m}(g;\Psi,x))_{x\in C}$, where all vectors are treated as column vectors, and $A$ denote the $|C| \times |C|$ matrix whose $(x,y)$-th entry is given by $\Phi_{2^m}(gb_{2^m};x,y)$, so that $d = Aa$ by the identity above. Then (5.29) implies that (6.9) is satisfied with $\gamma = 1/2$. Also, (jOg) implies that $\min_{x\in C}|A_{x,x}| \ge c2^{m(\alpha-\beta)}$.

Therefore, Proposition 6.1 shows that $A$ is invertible. Further, (6.10) implies that

$$\|a\|_{\ell^p} \le c2^{m(\beta-\alpha)}\|d\|_{\ell^p}.$$

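Now, let $\nu$ be the measure as in Lemma 5.3. Then $\nu \in \mathcal{M}_q$. So, (5.32) shows that for $2^m \ge c_1/q$,

$$\|d\|_{\ell^p} = q^{-\alpha/p}\|\Phi_{2^m}(g;\Psi)\|_{\nu;X,p} \le cq^{-\alpha/p}\|\Phi_{2^m}(g;\Psi)\|_p.$$

In view of ([^4]) applied with $g$ in place of $H$, $\|\Phi_{2^m}(g;\Psi)\|_p \le c\|\Psi\|_p$. Hence, for $2^m \ge c_1/q$,

$$\|a\|_{\ell^p} \le c2^{m(\beta-\alpha)}q^{-\alpha/p}\|\Psi\|_p.$$

We may now choose $m$ with $2^m \sim 1/q$ to arrive at (3.15). □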

Next, we turn our attention to the proof of Theorem 3.3. Towards this end, we recall the following theorem ([8, Chapter 7, Theorem 9.1, also Chapter 6.7]). Our assumption about the centers $C_m$ in the definition of the spaces $V_m$ being nested implies that the sequence of spaces $\{V_m\}$ satisfies the conditions listed in [8, Chapter 7, (5.2)] with the class $X^p$ in place of $X$ in [8], where the density assumption can be verified easily using (3.1) and the fact that $\delta(C_m) \to 0$ as $m \to \infty$. The statement of [8, Chapter 7, Theorem 9.1] is in terms of the Besov spaces in general; we apply it with the parameter $q = \infty$ there.
Theorem 6.2 Let $1 \le p \le \infty$, $r > 0$. Suppose that

$$\mathrm{dist}(F,V_m) \le cm^{-r}\|(\Delta^*)^r F\|_p, \qquad m = 1,2,\ldots,\ F \in W^p_r, \tag{6.11}$$

and

$$\|(\Delta^*)^r\Psi\|_p \le cm^r\|\Psi\|_p, \qquad \Psi \in V_m,\ m = 1,2,\ldots. \tag{6.12}$$

Then for $0 < \gamma < r$, $F \in H^p_\gamma$ if and only if $\sup_{m\ge 1} m^\gamma\,\mathrm{dist}(F,V_m) \le c(F)$.


Theorem 6.1 (used with $C_m$, $W_m$ in place of $C^*$, $W^*$ respectively) already shows that (6.11) holds. Thus, to complete the proof of Theorem 3.3, we need to establish



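Theorem 6.3 Let $1 \le p \le \infty$, $0 < r < \beta - \alpha/p'$, $C \subset X$ be a finite set, $q = q(C)$, and $\{a_y\}_{y\in C} \subset \mathbb{R}$. Then

$$\left\|(\Delta^*)^r\sum_{y\in C} a_yG(\circ,y)\right\|_p \le cq^{-r}\left\|\sum_{y\in C} a_yG(\circ,y)\right\|_p. \tag{6.13}$$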

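PROOF. Let $\nu \in \mathcal{M}_q$ be the measure as in Lemma 5.3. In this proof only, let $\Psi = \sum_{y\in C} a_yG(\circ,y)$. Then Proposition 5.2(b), used with $n = \lfloor\log_2(1/q)\rfloor$, shows that for any $F : C \to \mathbb{R}$,

$$\left\|\int_X \{G(\circ,y) - \Phi_{2^m}(hb_{2^m};\circ,y)\}F(y)\,d\nu(y)\right\|_p \le c2^{-m\beta}\,2^{\alpha(m-n)/p'}\|F\|_{\nu;X,p}.$$

Using $2^{-n} \sim q$, and the function $F$ defined by $F(y) = a_y$, $y \in C$, this translates into

$$\left\|q^\alpha\Psi - q^\alpha\sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c2^{-m\beta}(q2^m)^{\alpha/p'}q^{\alpha/p}\|a\|_{\ell^p};$$

that is,

$$\left\|\Psi - \sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c2^{-m(\beta-\alpha/p')}\|a\|_{\ell^p}.$$

In view of (3.15), this yields

$$\left\|\Psi - \sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c2^{-m(\beta-\alpha/p')}q^{\alpha/p'-\beta}\|\Psi\|_p. \tag{6.14}$$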



Next, we note that the function $b_r(t) := (1+|t|)^r b(t)$, $t \in \mathbb{R}$, is a mask of type $\beta - r$, and also that $(\Delta^*)^r G(\circ,y) = G(b_r;\circ,y)$, $y \in X$. Similarly, $(\Delta^*)^r\Phi_{2^m}(hb_{2^m};\circ,y) = \Phi_{2^m}(hb_{r,2^m};\circ,y)$. Hence, we may apply (6.14) with $(\Delta^*)^r G(\circ,y)$ in place of $G$, $\beta - r$ in place of $\beta$, and deduce that

$$\left\|(\Delta^*)^r\Psi - (\Delta^*)^r\sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c2^{-m(\beta-r-\alpha/p')}q^{\alpha/p'-\beta+r}\|(\Delta^*)^r\Psi\|_p. \tag{6.15}$$



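We now choose $m$ sufficiently large, so that $2^m \sim 1/q$, and $c2^{-m(\beta-r-\alpha/p')}q^{\alpha/p'-\beta+r} \le 1/2$. Then (6.14), (6.15) become

$$\left\|\Psi - \sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c\|\Psi\|_p$$

and

$$\left\|(\Delta^*)^r\Psi - (\Delta^*)^r\sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le (1/2)\|(\Delta^*)^r\Psi\|_p.$$

Since $\sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y) \in \Pi_{2^m}$, these estimates and (5.33) lead to

$$\|(\Delta^*)^r\Psi\|_p \le 2\left\|(\Delta^*)^r\sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c2^{mr}\left\|\sum_{y\in C} a_y\Phi_{2^m}(hb_{2^m};\circ,y)\right\|_p \le c2^{mr}\|\Psi\|_p.$$

Since $2^m \sim 1/q$, this implies (6.13). □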

PROOF of Theorem 3.3. We note that Theorem 6.2 is applicable in view of Theorem 6.1 and Theorem 6.3. The equivalence (a)$\Leftrightarrow$(c) follows from Theorem 6.2. The implication (a)$\Rightarrow$(b) follows from Theorem 3.2. The implication (b)$\Rightarrow$(c) is clear.

In the case when $p = \infty$, the implication (d)$\Rightarrow$(c) is clear. The implication (a)$\Rightarrow$(d) follows from Theorem GO] □



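PROOF of Theorem 3.5. Using (6.5), Theorem 6.3 (used with $\gamma$ in place of $r$), and Theorem 6.1, we obtain

$$\begin{aligned}
\|(\Delta^*)^\gamma\sigma_m(f) - (\Delta^*)^\gamma\Psi_m\|_p
&\le \|(\Delta^*)^\gamma\sigma_m(f) - (\Delta^*)^\gamma\mathbb{G}_m(f)\|_p + \|(\Delta^*)^\gamma\mathbb{G}_m(f) - (\Delta^*)^\gamma\Psi_m\|_p \\
&\le c\{m^{\gamma-r}\|(\Delta^*)^r\sigma_m(f)\|_p + m^\gamma\|\mathbb{G}_m(f) - \Psi_m\|_p\} \\
&\le c\{m^{\gamma-r}\|(\Delta^*)^r f\|_p + m^\gamma\|f - \mathbb{G}_m(f)\|_p + m^\gamma\|f - \Psi_m\|_p\} \\
&\le cm^{\gamma-r}\|(\Delta^*)^r f\|_p.
\end{aligned}$$

In view of Proposition 5.3, this leads to the desired estimate. □

We end this section with the postponed proof of Proposition 2.1.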



PROOF of Proposition 2.1. In order to prove part (a), let (in this proof only) $C = \{x_k\}_{k=1}^{M}$. We define $C_1 = C \cap \Delta(x_1,\epsilon)$. By relabeling the set if necessary, we choose $x_2 \in C_1$, and set $C_2 = C_1 \cap \Delta(x_2,\epsilon)$. Necessarily, $\rho(x_1,C_2) \ge \epsilon$ and $\rho(x_1,x_2) \ge \epsilon$. Since $C$ is finite, we may continue in this way at most $M$ times to obtain a subset $\tilde{C}$ of $C$ such that $q(\tilde{C}) \ge \epsilon$, and moreover, for any $x \in C$, there is $y \in \tilde{C}$ with $\rho(x,y) \le \epsilon$; i.e., $\delta(C,\tilde{C}) \le \epsilon$. It follows that

$$\delta(C) \le \delta(\tilde{C}) \le \delta(C) + \epsilon.$$

This completes the proof of part (a).
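
The greedy selection used in part (a) can be sketched in a few lines. This is only an illustration under stated assumptions: the points live in the unit square with the Euclidean distance standing in for the metric $\rho$ on the manifold, the function names are hypothetical, and the minimal separation is taken to be the smallest pairwise distance (the normalization of $q$ in the paper may differ by a constant factor).

```python
import math
import random

def greedy_separated_subset(points, eps, dist):
    """Pick a point, discard everything closer than eps to it, and repeat."""
    remaining = list(points)
    chosen = []
    while remaining:
        x = remaining[0]
        chosen.append(x)
        # Keep only the points at distance at least eps from the chosen point.
        remaining = [y for y in remaining[1:] if dist(x, y) >= eps]
    return chosen

def min_separation(points, dist):
    """Smallest pairwise distance; infinite for fewer than two points."""
    pairs = [dist(x, y) for i, x in enumerate(points) for y in points[i + 1:]]
    return min(pairs) if pairs else math.inf

def covering_radius(points, subset, dist):
    """Largest distance from a point of `points` to the nearest point of `subset`."""
    return max(min(dist(x, y) for y in subset) for x in points)

rng = random.Random(1)
C = [(rng.random(), rng.random()) for _ in range(200)]  # illustrative data only
eps = 0.15
C_tilde = greedy_separated_subset(C, eps, math.dist)
```

By construction, every discarded point lies within `eps` of some chosen point, and every chosen point is at least `eps` away from all previously chosen ones; these are the two properties of the subset used in the proof.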

To prove part (b), we will use some notation which will be different from the rest of the proof. In view of the fact that $\delta(C_1) \le (1/2)\delta(C_0) \le q(C_0)$, the points of $C_0$ are already at least $\delta(C_1)$ separated from each other. Let $C_1^{\#}$ be the subset of $C_1 \setminus C_0$ comprising points which are at least $\delta(C_1)$ away from any point in $C_0$. Let $C_1^{\#\#} \subseteq C_1^{\#}$ be selected as in part (a), so that

$$\delta(C_1^{\#},C_1^{\#\#}) \le \delta(C_1) \le q(C_1^{\#\#}), \tag{6.16}$$

and $\tilde{C}_1 := C_1^{\#\#} \cup C_0$. Clearly, $\tilde{C}_1 \supseteq C_0$, and $q(\tilde{C}_1) \ge \delta(C_1)$. If $x \in C_1$ and there is no point of $C_0$ within $\delta(C_1)$ of $x$, then $x \in C_1^{\#}$. In view of (6.16), there is a point in $C_1^{\#\#}$ within $\delta(C_1)$ of $x$. So, in any case, for any $x \in C_1$, there is a point in $\tilde{C}_1$ within $\delta(C_1)$ of $x$. Therefore,

$$\delta(C_1) \le \delta(\tilde{C}_1) \le 2\delta(C_1) \le 2q(\tilde{C}_1).$$

This completes the proof of part (b).
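
The extension step of part (b) admits a similar sketch. Again this is a hedged illustration rather than the construction itself: the Euclidean plane replaces the manifold, the parameter `d` plays the role of $\delta(C_1)$ and is simply supplied by hand, the function names are hypothetical, and the coarse set is assumed to be at least `d`-separated already.

```python
import math
import random

def greedy_separated_subset(points, eps, dist):
    """Greedy selection as in part (a): keep a point, drop everything closer than eps."""
    remaining = list(points)
    chosen = []
    while remaining:
        x = remaining[0]
        chosen.append(x)
        remaining = [y for y in remaining[1:] if dist(x, y) >= eps]
    return chosen

def extend_nested(C0, C1, d, dist):
    """Extend C0 by d-separated points of C1 that are at least d away from C0.

    Assumes the pairwise distances within C0 are already at least d.
    """
    C0_set = set(C0)
    far = [x for x in C1
           if x not in C0_set and all(dist(x, y) >= d for y in C0)]
    selected = greedy_separated_subset(far, d, dist)
    return list(C0) + selected

# Illustrative data: a coarse grid of spacing 0.5 and a finer random set containing it.
C0 = [(0.5 * i, 0.5 * j) for i in range(3) for j in range(3)]
rng = random.Random(2)
C1 = C0 + [(rng.random(), rng.random()) for _ in range(300)]
d = 0.2
C1_tilde = extend_nested(C0, C1, d, math.dist)

# Smallest pairwise distance in the extended set, and its covering radius over C1.
separation = min(math.dist(x, y)
                 for i, x in enumerate(C1_tilde) for y in C1_tilde[i + 1:])
covering = max(min(math.dist(x, y) for y in C1_tilde) for x in C1)
```

The extended set contains the coarse set, stays `d`-separated, and leaves no point of the finer set farther than `d` from it, which mirrors the three facts used in the proof.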

To prove part (c), we note that there exist integers $\ell, n \ge 0$ such that

$$(2^\ell k)^{-1} \le \delta(C_k) \le (2^{-n}k)^{-1}, \qquad k = 1,2,\ldots. \tag{6.17}$$



In this proof only, we define $C'_k = C_{2^{k(\ell+n+1)}}$, $k = 0,1,2,\ldots$. Then it is clear that $C'_k \subseteq C'_{k+1}$, and it is easy to check using (6.17) that $\delta(C'_{k+1}) \le (1/2)\delta(C'_k)$. With the construction as in the proof of part (a), we obtain $C''_0 \subseteq C'_0$ such that

$$\delta(C'_1) \le (1/2)\delta(C'_0) \le (1/2)\delta(C''_0) \le q(C''_0).$$

We then use part (b) with $C''_0$ in place of $C_0$ of part (b) and $C'_1$ in place of $C_1$ of part (b) to obtain $C''_1 \subseteq C'_1$ such that $C''_1 \supseteq C''_0$, $\delta(C'_1) \le \delta(C''_1) \le 2\delta(C'_1) \le 2q(C''_1)$, and $\delta(C'_2) \le (1/2)\delta(C'_1) \le (1/2)\delta(C''_1)$. Proceeding by induction, we construct an increasingly nested sequence $\{C''_k \subseteq C'_k\}$ with $\delta(C''_k) \le 2\delta(C'_k) \le 2q(C''_k)$. We observe that

If $m \ge 1$ is any integer, we find integer $k$ such that $2^{k(\ell+n+1)} \le m < 2^{(k+1)(\ell+n+1)}$, and define $\tilde{C}_m = C''_k$. Then $C_m \supseteq C_{2^{k(\ell+n+1)}} = C'_k \supseteq C''_k = \tilde{C}_m$. Moreover, since the value of $k$ corresponding to $m$ does not exceed that corresponding to $m+1$, and the sequence $\{C''_k\}$ is increasingly nested, we have $\tilde{C}_m \subseteq \tilde{C}_{m+1}$. It
is easy to verify from (|6.18|) that <5(C m ) < 2<5(C m ) and that 8{C m ) ~ 1/m. □ 